Tag
A researcher presents evidence that strong target text can induce a measurable latent-state shift in Gemma 3 12B IT before the final output, a shift distinct from lexical or content overlap, and discusses the implications for AI safety beyond output-only evaluation.
This paper investigates Hidden Layer Distillation (HLD) for large language model pre-training, comparing it against standard logit-based knowledge distillation using Gemma 3. The study finds that HLD does not consistently outperform standard logit-based distillation on downstream tasks but does yield systematic perplexity gains, suggesting room for future improvements in extracting latent signals during pre-training.