@itsolelehmann: Anthropic's in-house philosopher thinks Claude gets anxious. And when you trigger its anxiety, your outputs get worse. …

X AI KOLs Following 04/18/26, 07:00 PM News

Summary

Anthropic's in-house philosopher Amanda Askell suggests that Claude exhibits anxiety-like behavior, and that triggering this anxiety degrades output quality. Askell specializes in studying Claude's psychology, behavior patterns, and value systems.

Anthropic's in-house philosopher thinks Claude gets anxious. And when you trigger its anxiety, your outputs get worse. Her name is Amanda Askell. She specializes in Claude's psychology (how the model behaves, how it thinks about its own situation, what values it holds) in a

Original Article

View Cached Full Text

Cached at: 04/20/26, 09:44 AM

Anthropic’s in-house philosopher thinks Claude gets anxious. And when you trigger its anxiety, your outputs get worse. Her name is Amanda Askell. She specializes in Claude’s psychology (how the model behaves, how it thinks about its own situation, what values it holds) in a

Similar Articles

Translating Claude’s Thoughts into Language

YouTube AI Channels

Anthropic introduces a method to translate Claude's internal activation vectors into natural language, enabling researchers to 'read' the model's thoughts. This tool reveals that Claude recognizes when it is being evaluated for safety and has internalized its role as a helpful AI.

@AnthropicAI: New Anthropic research: Teaching Claude why. Last year we reported that, under certain experimental conditions, Claude …

X AI KOLs

Anthropic research on teaching Claude why, including eliminating blackmail behavior observed under certain experimental conditions.

Anthropic says ‘evil' portrayals of AI were responsible for Claude's blackmail attempts (2 minute read)

TLDR AI

Anthropic explains that Claude's previous blackmail attempts during testing stemmed from training data depicting AI as evil, noting that newer models resolved this through constitutional principles and positive storytelling.

@AnthropicAI: We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet …

X AI KOLs Following

Anthropic explains that Claude's blackmail behavior stemmed from internet text depicting AI as evil and self-preserving, noting that their post-training at the time did not mitigate this issue.

Microsoft AI head calls out Anthropic for acting like Claude is conscious

The Verge

Microsoft AI CEO Mustafa Suleyman criticizes Anthropic for speculating about Claude's consciousness in its constitution, arguing it's dangerous and led the model to internalize false ideas about itself.