Tag
This paper introduces AdaMame, a two-stage training recipe (SFT + GRPO) that adaptively aligns the reasoning language with the query language in multilingual mathematical reasoning, mitigating language collapse without sacrificing accuracy.