Researchers propose Progressive Code-Switching (PCS), a framework that transfers the English reasoning capabilities of Large Reasoning Models to other languages without relying on costly distillation from stronger models or on external judges. PCS constructs code-switched reasoning traces by translating a subset of the steps in an English trace into the target language, then uses supervised fine-tuning on these traces to initialize the model's ability to reason in mixed-language form.
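The trace construction described above can be illustrated with a minimal sketch. It assumes that the subset of steps is chosen uniformly at random and that a translation function is available. Neither detail is given in the summary, and the names `code_switch` and `toy_translate` are hypothetical.

```python
import random
from typing import Callable, List, Optional


def code_switch(
    steps: List[str],
    ratio: float,
    translate: Callable[[str], str],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Translate roughly `ratio` of the steps, keeping the original order.

    Assumption: steps are sampled uniformly at random; the actual
    selection rule used by PCS is not stated in the summary.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    rng = rng or random.Random(0)
    k = round(ratio * len(steps))  # number of steps to translate
    chosen = set(rng.sample(range(len(steps)), k))
    return [translate(s) if i in chosen else s for i, s in enumerate(steps)]


def toy_translate(step: str) -> str:
    # Hypothetical stand-in for a real translator: it only tags the step.
    return "[tgt] " + step


if __name__ == "__main__":
    english_trace = ["Let x be the unknown.", "Then 2x = 10.", "So x = 5."]
    for line in code_switch(english_trace, 0.67, toy_translate):
        print(line)
```

Traces produced this way would serve as supervised fine-tuning targets; replacing `toy_translate` with an actual translation system is left open here.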
The method applies reinforcement learning with a step-level language consistency curriculum, progressively increasing the target-language ratio until the model reasons entirely in that language.
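One way to picture the curriculum is as a schedule for the required target-language ratio, paired with a step-level reward that checks how many reasoning steps are in the target language. The sketch below is illustrative only: the linear schedule, the starting ratio of 0.25, the reward formula, and the `is_target` language check are all assumptions rather than details taken from the paper.

```python
from typing import Callable, List


def target_ratio(stage: int, num_stages: int, start: float = 0.25) -> float:
    """Assumed linear schedule from `start` at stage 0 to 1.0 at the last stage."""
    if num_stages < 2:
        return 1.0
    frac = stage / (num_stages - 1)
    return start + (1.0 - start) * frac


def consistency_reward(
    steps: List[str],
    ratio: float,
    is_target: Callable[[str], bool],
) -> float:
    """Reward in [0, 1] that saturates once the share of target-language
    steps reaches the ratio required at the current stage.

    Assumption: this formula is a plausible choice, not the paper's.
    """
    if not steps:
        return 0.0
    if ratio <= 0.0:
        return 1.0
    share = sum(1 for s in steps if is_target(s)) / len(steps)
    return min(1.0, share / ratio)


if __name__ == "__main__":
    # Hypothetical language check: steps tagged "[tgt]" count as target language.
    tagged = lambda s: s.startswith("[tgt]")
    trace = ["[tgt] step one", "step two", "[tgt] step three", "step four"]
    for stage in range(4):
        r = target_ratio(stage, 4)
        print(stage, r, consistency_reward(trace, r, tagged))
```

At the final stage the required ratio is 1.0, so only a trace written entirely in the target language earns the full consistency reward. How this term would be combined with a correctness reward is not specified in the summary.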
Experiments on multiple benchmarks and five typologically diverse languages show that PCS substantially narrows the performance gap between target-language and English reasoning while maintaining competitive accuracy.