llm-distillation

Rethinking the Role of Temperature in Large Language Model Distillation

arXiv cs.LG · 4d ago

This paper reexamines the role of temperature in large language model distillation. It finds that temperature benefits forward KL divergence more than reverse KL, so simple KL-based methods can match state-of-the-art distillation approaches at higher temperatures.
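
For readers unfamiliar with the two objectives, the following is a minimal sketch of temperature-scaled forward and reverse KL between a teacher and a student distribution. It assumes the common convention of dividing both teacher and student logits by the same temperature before the softmax; the paper's exact setup may differ. The function names and the example logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_kls(teacher_logits, student_logits, temperature=1.0):
    """Return (forward KL, reverse KL) at the given temperature.

    Assumption: the same temperature is applied to both teacher and student.
    Forward KL is KL(teacher || student); reverse KL is KL(student || teacher).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return kl(p, q), kl(q, p)


if __name__ == "__main__":
    # Hypothetical logits over a three-token vocabulary.
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    for t in (1.0, 2.0, 4.0):
        fwd, rev = distillation_kls(teacher, student, temperature=t)
        print(f"T={t}: forward KL={fwd:.4f}, reverse KL={rev:.4f}")
```

The sketch only computes the two divergences on fixed logits; it does not train a student and says nothing on its own about which objective distills better.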

Bounded Behavioral Indistinguishability for Black-Box LLM Distillation

arXiv cs.LG · 5d ago

This paper introduces bounded behavioral indistinguishability, a formal framework for evaluating black-box LLM distillation beyond semantic similarity. Experiments on Qwen and Llama models show that distillation reduces but does not eliminate adversarial distinguishability, highlighting the need for category-aware evaluation.
