token-statistics


Forecasting Downstream Performance of LLMs With Proxy Metrics

Hugging Face Daily Papers · 2026-05-18

This paper introduces proxy metrics, built from token-level statistics of expert-written solutions, that forecast the downstream performance of LLMs. The authors report that these metrics significantly outperform loss-based methods in three settings: model selection, pretraining data selection, and training-time forecasting.


Token Statistics Reveal Conversational Drift in Multi-turn LLM Interaction

arXiv cs.CL · 2026-04-20

This paper introduces Bipredictability (P) and the Information Digital Twin (IDT), a lightweight method for monitoring conversational consistency in multi-turn LLM interactions. The method relies on token frequency statistics alone and needs neither embeddings nor access to model internals. The authors report 100% sensitivity in detecting contradictions and topic shifts, and present the approach as a practical monitoring framework for extended LLM deployments.
