Tag
This paper introduces Confident Decoding, a training-free decoding strategy that dynamically selects the most reliable intermediate layer in LLMs using entropy-guided search, mitigating the alignment tax and improving reasoning performance on benchmarks like GPQA-Diamond and Omni-MATH with negligible overhead.
FoRA introduces a parameter-efficient fine-tuning method that selects task-informative layers via Fisher scores and trains LoRA down-projections on the Stiefel manifold, reducing parameters while preserving accuracy.
This paper proposes automatic layer selection for hallucination detection in LLMs and introduces First Effective Peak of Intrinsic Dimension (FEPoID), a training-free criterion that consistently identifies optimal intermediate layers, outperforming existing heuristics.
This paper introduces LOES (Layer-wise Optimal Embedding Selection) and GeoReg (Geometric Regularization Loss), methods that select and fuse task-relevant intermediate layers from deep models to improve transfer learning performance, demonstrating consistent gains across architectures and modalities.
This paper introduces a principled approach to multilingual language steering using sparse autoencoders (SAEs) trained on multilingual data and a novel layer selection rule based on the intersection of multilingual alignment and language separability, evaluated on LLaMA-3.1-8B and Gemma-2-9B for machine translation and cross-lingual summarization.
This paper introduces the Linear Accessibility Profile (LAP), a diagnostic method using logit lens to predict steering vector effectiveness across model layers, achieving ρ=+0.86 to +0.91 correlation on 24 concept families across five models. The work provides a systematic framework to determine which layers and concepts are suitable for steering interventions, replacing ad-hoc trial-and-error approaches.
Aletheia introduces a gradient-guided layer selection method for efficient LoRA fine-tuning that identifies task-relevant transformer layers via lightweight gradient probes and applies adapters selectively, achieving 15-28% training speedup across 14 models while maintaining downstream performance on MMLU, GSM8K, and HumanEval benchmarks.