This paper reexamines the role of temperature in large language model distillation. It shows that temperature benefits forward KL divergence more than reverse KL divergence, so that simple KL-based methods can match state-of-the-art distillation approaches at higher temperatures.
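To make the two objectives concrete, here is a minimal sketch of temperature-scaled forward and reverse KL between a teacher and a student distribution. It is not the paper's implementation. The function names, the toy logits, the choice to apply the same temperature to both models, and the convention that forward KL is KL(teacher || student) are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_losses(teacher_logits, student_logits, temperature=1.0):
    """Forward and reverse KL at a shared temperature (an assumption here)."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return {
        "forward_kl": kl(p, q),  # KL(teacher || student)
        "reverse_kl": kl(q, p),  # KL(student || teacher)
    }


# Hypothetical logits over a four-token vocabulary.
teacher_logits = [4.0, 1.0, 0.5, -2.0]
student_logits = [2.5, 2.0, 0.0, -1.0]

for t in (1.0, 2.0, 4.0):
    print(t, distillation_losses(teacher_logits, student_logits, t))
```

Higher temperatures flatten both distributions, so both divergences shrink toward zero on fixed logits. The sketch only shows how the two losses are computed, not the training-time asymmetry the paper reports.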
This paper introduces bounded behavioral indistinguishability, a formal framework for evaluating black-box LLM distillation beyond semantic similarity. Experiments on Qwen and Llama models show that distillation reduces but does not eliminate adversarial distinguishability, highlighting the need for category-aware evaluation.