Steered LLM Activations are Non-Surjective

Hugging Face Daily Papers · 2026-05-07

This paper proves that activation steering in LLMs produces internal states that cannot be replicated by any textual prompt, establishing a formal separation between white-box steerability and black-box prompting.
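
One way to make the claim concrete, as a sketch only: the notation below (the prompt set, the activation map, and an additive steering vector) is assumed here for illustration and is not taken from the paper, whose formalization may differ.

```latex
% Assumed notation (hypothetical, not from the paper):
%   \mathcal{V}^*  : set of all finite token sequences (textual prompts)
%   f_\ell         : map from a prompt to the layer-\ell activation in \mathbb{R}^d
%   v, \alpha      : steering direction and strength (additive steering is an assumption)
\[
  f_\ell : \mathcal{V}^* \to \mathbb{R}^d,
  \qquad
  \tilde{h} = f_\ell(x) + \alpha v .
\]
% Non-surjectivity, read this way: some steered state lies outside the image of f_\ell.
\[
  \exists\, x \in \mathcal{V}^*,\ \alpha \in \mathbb{R},\ v \in \mathbb{R}^d
  \quad \text{such that} \quad
  \tilde{h} \notin f_\ell(\mathcal{V}^*) .
\]
```

Under this reading, white-box access (editing the activation directly) reaches states that black-box access (choosing the prompt) cannot.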
