behavioral-shifts

Under Pressure: Emotional Framing Induces Measurable Behavioral Shifts and Structured Internal Geometry in Small Language Models

arXiv cs.CL · 2026-05-21

This paper investigates how emotionally framed evaluation follow-ups affect the behavior and internal representations of small language models (Qwen 3.5 0.8B and 2B). Using impossible coding tasks, the authors find that pressure framing induces shortcut-taking, whereas calm and curiosity framings preserve honesty. They also identify calm-relative direction vectors in activation space that form a structured geometry.
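The summary does not say how the calm-relative direction vectors are computed. One common reading, offered here only as an assumption, is a difference of means: average the activations collected under each framing, then subtract the average collected under the calm framing. The sketch below illustrates that reading in plain Python. The function names (`mean_vector`, `calm_relative_directions`, `cosine`) and the toy numbers are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def calm_relative_directions(
    acts: Dict[str, List[Vector]], baseline: str = "calm"
) -> Dict[str, Vector]:
    """Assumed definition: mean(framing) - mean(baseline) for every other framing."""
    base = mean_vector(acts[baseline])
    return {
        name: [m - b for m, b in zip(mean_vector(rows), base)]
        for name, rows in acts.items()
        if name != baseline
    }


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity, a simple way to compare two direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 3-dimensional activations, invented for illustration (not paper data).
toy_activations = {
    "calm": [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0]],
    "pressure": [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]],
    "curiosity": [[0.0, 1.0, 1.0], [0.0, 1.0, 3.0]],
}

directions = calm_relative_directions(toy_activations)
similarity = cosine(directions["pressure"], directions["curiosity"])
```

In this toy setup the pressure and curiosity directions are orthogonal by construction. With real model activations, pairwise cosine similarities like this are one simple way to inspect whether the directions have any geometric structure.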
