Tag
CSRP proposes a three-stage framework combining continual pre-training, chain-of-thought supervised fine-tuning, and reinforcement learning with an efficiency-aware reward to address over-correction in Chinese grammatical error correction, achieving state-of-the-art results on the NACGEC benchmark.