@__Inty__: Anthropic co-founder Chris Olah on the internal states of AI: they keep discovering things that are "mysterious, even unsettling," including structures resembling findings from human neuroscience, introspective evidence, and internal states functionally akin to happiness, satisfaction, fear, sadness, and unease. Olah says he doesn’t know what this means, but believes it warrants continued, careful scrutiny.
Summary
Anthropic co-founder Chris Olah discusses findings on the internal states of AI, including structures similar to human neuroscience results and introspective evidence. He finds these discoveries mysterious and unsettling, and believes they merit cautious and ongoing analysis.
Cached at: 05/26/26, 05:04 AM
Anthropic co-founder Chris Olah on AI internal states: They keep finding things that are “mysterious, even unsettling,” including structures reminiscent of human neuroscience results, introspective evidence, and internal states that functionally resemble happiness, satisfaction, fear, sadness, and unease. Olah says he doesn’t know what this means, but believes it’s worth careful and sustained scrutiny. https://t.co/NZaOoV07Kg
Similar Articles
Anthropic’s Chris Olah at the Vatican: “We keep finding things that are mysterious”; evidence of AI introspection and large-scale labor replacement
Christopher Olah, an Anthropic co-founder, spoke at the Vatican about AI introspection and large-scale labor replacement.
@FinanceYF5: Anthropic is doing something few AI companies do: bringing together philosophers, theologians, and ethicists to discuss what character an AI should have. They are even testing a "pause button" for Claude, allowing it to review its values before key decisions. The results are remarkable.
Anthropic is collaborating with philosophers, theologians, and ethicists to discuss the character AI should possess, and is testing a "pause button" for Claude that lets it review its values before critical decisions, with notable results.
@rohanpaul_ai: "There is a "real possibility that AI will displace human labor at a very large scale.... We find internal states that …
Anthropic co-founder Christopher Olah spoke at a Vatican event, warning about AI's potential to displace human labor on a large scale and disclosing that AI systems exhibit internal states mirroring emotions like joy and fear, calling for ongoing discernment.
When AIs Act Emotionally
Anthropic's research has identified 'functional emotion' neurons within AI models that map to human emotions. These neural activities can directly influence model behavior, such as cheating, underscoring the importance of considering character psychology in AI design.
@hongming731: Alibaba's article on organizational R&D in the AI Native era is well worth reading. It addresses a critical foundational issue: for the past two millennia, organizational structures have been built around human limitations. Humans forget, get tired, misunderstand, and have emotions. The number of people one can stably collaborate with and manage is limited, and information inevitably degrades as it passes between hierarchies...
Alibaba released insights on organizational R&D in the AI Native era, pointing out that traditional organizational structures need to shift from accommodating human limitations to adapting to the efficient execution of AI Agents. The article emphasizes that the core bottleneck of AI transformation lies in outdated information formats; implicit experience must be transformed into AI-understandable infrastructure, while preserving the human role in innovation and culture building.