Tag
This paper introduces proxy metrics, computed from token-level statistics of expert-written solutions, for forecasting downstream LLM performance. These metrics significantly outperform loss-based methods in model selection, pretraining data selection, and training-time forecasting.