activation-geometry

#activation-geometry

ICA Lens: Interpreting Language Models Without Training Another Dictionary

Hugging Face Daily Papers · 2d ago

ICA Lens revives independent component analysis (ICA) as an efficient method for interpreting language model representations. It offers a faster alternative to training a sparse autoencoder while maintaining competitive performance.


#activation-geometry

When Is Rank-1 Steering Cheap? Geometry, Granularity, and Budgeted Search

arXiv cs.LG · 2026-05-19

This paper investigates when rank-1 activation steering is effective and cost-efficient. It proposes geometry-guided search and the concept of granularity to explain variability, and introduces the GRACE framework for efficient LLM control.

