Tag
The paper proposes High-Entropy Sum (HES), a training-free metric for selecting high-quality reasoning data for LLM training, validated across SFT, RFT, and RL paradigms.