long-cot

Unified Data Selection for LLM Reasoning

arXiv cs.CL · 2026-05-22

The paper proposes High-Entropy Sum (HES), a training-free metric for selecting high-quality reasoning data for LLM training, validated across SFT, RFT, and RL paradigms.
