training-free-diagnostics

Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why

Hugging Face Daily Papers · 2026-05-11

This paper introduces a training-free diagnostic framework for analyzing per-token distillation signals in reasoning models. It finds that guidance is more beneficial on incorrect rollouts and that its benefit depends on student capacity and task context.
