Tag
GRACE proposes a gradient-aligned method that scores individual reasoning steps to select the most valuable data for post-training, achieving 108.8% of full-data performance with only 20% of the data.