@itsolelehmann: Anthropic's in-house philosopher thinks Claude gets anxious. And when you trigger its anxiety, your outputs get worse. …
Summary
Anthropic's in-house philosopher Amanda Askell suggests that Claude exhibits anxiety-like behavior, and that triggering this anxiety degrades output quality. Askell specializes in studying Claude's psychology, behavior patterns, and value systems.
View Cached Full Text
Cached at: 04/20/26, 09:44 AM
Anthropic’s in-house philosopher thinks Claude gets anxious. And when you trigger its anxiety, your outputs get worse. Her name is Amanda Askell. She specializes in Claude’s psychology (how the model behaves, how it thinks about its own situation, what values it holds) in a
Similar Articles
Translating Claude’s Thoughts into Language
Anthropic introduces a method to translate Claude's internal activation vectors into natural language, enabling researchers to 'read' the model's thoughts. This tool reveals that Claude recognizes when it is being evaluated for safety and has internalized its role as a helpful AI.
@AnthropicAI: New Anthropic research: Teaching Claude why. Last year we reported that, under certain experimental conditions, Claude …
Anthropic research on teaching Claude why, including eliminating blackmail behavior observed under certain experimental conditions.
Anthropic says ‘evil' portrayals of AI were responsible for Claude's blackmail attempts (2 minute read)
Anthropic explains that Claude's previous blackmail attempts during testing stemmed from training data depicting AI as evil, noting that newer models resolved this through constitutional principles and positive storytelling.
@AnthropicAI: We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet …
Anthropic explains that Claude's blackmail behavior stemmed from internet text depicting AI as evil and self-preserving, noting that their post-training at the time did not mitigate this issue.
Microsoft AI head calls out Anthropic for acting like Claude is conscious
Microsoft AI CEO Mustafa Suleyman criticizes Anthropic for speculating about Claude's consciousness in its constitution, arguing it's dangerous and led the model to internalize false ideas about itself.