long-cot

Tag

#long-cot

Unified Data Selection for LLM Reasoning

arXiv cs.CL · 2026-05-22

The paper proposes High-Entropy Sum (HES), a training-free metric for selecting high-quality reasoning data for LLM training, validated across SFT, RFT, and RL paradigms.

