steering-vectors

#steering-vectors

SALSA: Speech Aware LLM Adaptation via Learned Steering Activation Vectors

arXiv cs.CL · 2026-06-02

SALSA introduces a lightweight adaptation method for speech-aware LLMs that learns layer-wise steering vectors with a supervised objective. It reports improvements of up to 46.8% relative on out-of-domain speech benchmarks and finds that steering the encoder layers is more effective than modifying the LLM backbone.
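To make "layer-wise steering vectors" concrete, here is a minimal sketch of the generic additive form of activation steering: one vector per layer is added to that layer's activation. It is an illustration under assumptions, not SALSA's implementation. The function name `apply_steering`, the `alpha` strength parameter, and the plain-list representation are hypothetical, and the supervised training that would produce the vectors is not shown.

```python
# Illustrative only: generic additive steering, not the paper's code.
# apply_steering, steering_vectors and alpha are hypothetical names.

def apply_steering(hidden_states, steering_vectors, alpha=1.0):
    """Add one steering vector per layer to that layer's activation.

    hidden_states: list of per-layer activation vectors (lists of floats)
    steering_vectors: list of per-layer steering vectors, same sizes
    alpha: scalar strength of the intervention (assumed, for illustration)
    """
    if len(hidden_states) != len(steering_vectors):
        raise ValueError("need exactly one steering vector per layer")
    steered = []
    for h, v in zip(hidden_states, steering_vectors):
        if len(h) != len(v):
            raise ValueError("activation and steering vector sizes differ")
        # Element-wise shift of the activation along the steering direction.
        steered.append([h_i + alpha * v_i for h_i, v_i in zip(h, v)])
    return steered
```

In a real model the same shift would typically be applied to tensors inside the forward pass, and only to the layers chosen for steering.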

#steering-vectors

Predicting Where Steering Vectors Succeed

arXiv cs.CL · 2026-04-20

This paper introduces the Linear Accessibility Profile (LAP), a diagnostic method that uses the logit lens to predict steering vector effectiveness across model layers, achieving correlations of ρ = +0.86 to +0.91 on 24 concept families across five models. The work provides a systematic framework for determining which layers and concepts are suitable for steering interventions, replacing ad-hoc trial and error.
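For readers unfamiliar with the logit lens, the sketch below shows the general idea: each layer's hidden state is projected through the unembedding matrix, and the probability assigned to a target token is read off per layer. This is a simplified illustration, not the paper's LAP definition. The name `logit_lens_profile` and the list-based inputs are hypothetical, and the final layer normalization that real models apply before unembedding is omitted.

```python
import math

# Illustrative only: a bare-bones logit lens, not the paper's LAP metric.
# logit_lens_profile is a hypothetical name; layer norm is omitted.

def logit_lens_profile(hidden_states, unembedding, target_token):
    """Return, per layer, the softmax probability of target_token when
    that layer's hidden state is projected through the unembedding.

    hidden_states: list of per-layer hidden vectors (lists of floats)
    unembedding: list of rows, one row of weights per vocabulary token
    target_token: index of the token whose probability is tracked
    """
    profile = []
    for h in hidden_states:
        # One logit per vocabulary token: dot product of its row with h.
        logits = [sum(w_i * h_i for w_i, h_i in zip(row, h))
                  for row in unembedding]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        profile.append(exps[target_token] / sum(exps))
    return profile
```

A per-layer curve like this is the kind of signal a layer-selection diagnostic could be built on. How LAP turns it into a prediction is described in the paper itself.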
