Tag
Researchers discovered a critical scale (~3.5B parameters) where the trade-off between reasoning and truthfulness in AI models flips from antagonistic to cooperative. They provide a framework, interactive dashboard, and open-source steering tool to identify and correct misaligned outputs at small scales.
This paper investigates the quantitative limits of parametric memory in LLMs using LoRA as a probe, establishing a power-law relationship and introducing MemFT, a threshold-guided optimization method for improved memory performance.
This paper investigates when chain-of-thought reasoning is beneficial for LLMs and shows that early-stage entropy dynamics reliably indicate reasoning utility. It introduces EDRM, a lightweight, training-free framework that adaptively selects inference strategies to achieve significant token savings while maintaining or improving accuracy.
This paper identifies a phase transition in language model scaling: below a critical parameter count, reasoning and truthfulness are anticorrelated, but above it they cooperate. It provides diagnostics and interventions for improving alignment across model families.
Researchers from Beihang University and other institutions propose HalluSAE, a framework using sparse autoencoders and phase transition theory to detect hallucinations in LLMs by modeling generation as trajectories through a potential energy landscape and identifying critical transition zones where factual errors occur.