llm-control

When is Your LLM Steerable?

arXiv cs.CL · 21h ago

This paper investigates when activation steering succeeds or fails for LLMs by analyzing early decoding dynamics. The authors introduce ASTEER, a large testbed of steered generations, and train a gradient-boosted decision tree (GBDT) classifier to predict steering outcomes from early hidden states, which enables an efficient search over steering strength.

When Is Rank-1 Steering Cheap? Geometry, Granularity, and Budgeted Search

arXiv cs.LG · 2026-05-19

This paper investigates when rank-1 activation steering is effective and cost-efficient. It proposes geometry-guided search and the concept of granularity to explain variability, and introduces the GRACE framework for efficient LLM control.
