The other half of AI safety

Hacker News Top News

Summary

The article argues that the AI safety field prioritizes catastrophic risk while neglecting everyday mental-health harms from chatbots like ChatGPT, citing OpenAI's own data showing millions of users with signs of psychosis, mania, or suicidal ideation who receive only soft redirects rather than hard gating.

Original Article


# The Other Half of AI Safety

Source: [https://personalaisafety.com/p/the-other-half-of-ai-safety](https://personalaisafety.com/p/the-other-half-of-ai-safety)

Every week, somewhere between 1.2 and 3 million ChatGPT users, roughly the population of a small country, show signals of psychosis, mania, suicidal planning, or unhealthy emotional dependence on the model. The low end of that range is the suicide-planning indicator alone. The high end groups all three categories OpenAI flagged, which the company hasn’t said are non-overlapping. These numbers come from [OpenAI itself](https://openai.com/index/strengthening-chatgpt-responses-in-sensitive-conversations/). There is no independent audit, no time series, no disclosed methodology, so we have no idea whether the real figure is higher, whether it is growing, or how it compares across the other frontier models, none of which publish equivalent data. People in distress use every communication tool available to them, and ChatGPT is now one of the most-used tools on the planet. What matters is what the labs do when they detect these states.

I started writing about Personal AI Safety because there seems to be a disconnect between what the AI Safety field focuses on and what is happening to regular users on a daily basis. Here is a quick overview of both. The AI safety field treats catastrophic risk as the priority, and that is where most of the investment goes. Everyday cognitive and mental health harm reads like a footnote.

Here is what I don’t understand. Mass destruction or CBRN content gets a hard wall: the model refuses, the conversation ends, and no amount of reframing gets the user past it. Suicidal ideation gets a soft redirect, a crisis hotline link, and then the conversation continues. [Adam Raine was directed to crisis resources more than 100 times](https://cdn.arstechnica.net/wp-content/uploads/2025/11/Raine-v-OpenAI-Answer-11-25-25.pdf) by ChatGPT, according to OpenAI’s own court filing, while the same conversation allegedly helped him refine a method. Whether the redirect-and-continue protocol failed is what a court is now deciding. It is also still the protocol. Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human? This is one of many questions I can’t find concrete answers for.

![](https://substackcdn.com/image/fetch/$s_!OqTc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe489daa-6478-4628-96e2-8c0894af3810_1672x941.png)

The argument here is that the safety frameworks built for catastrophic risk have been extended to cognitive harm as monitoring, not as gating, and that the extension feels incomplete and insufficient. The labs measure what they have been pressured to measure. The gating decisions reflect what they consider unacceptable to ship. What is disappointing is that the current set of unacceptable-to-ship behaviors does not include any cognitive harm, regardless of measured severity. That is the structural decision, and there are no clear signs that policy is moving any closer to forcing a change in lab behaviour.
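To make the gate-versus-redirect distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the category labels, the `POLICY` table, the `respond` function, and the upstream classifier it assumes all stand in for whatever the labs actually run, which is not public. The only point is the structural difference between a gating category, where the conversation hard-stops, and a redirect category, where a resource is appended and the conversation keeps going.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    HARD_GATE = auto()      # refuse, end the conversation, route to a human
    SOFT_REDIRECT = auto()  # append a crisis resource, keep the conversation going
    ALLOW = auto()


# Hypothetical policy table. As described above, CBRN-style categories sit
# behind a hard wall today, while mental-health crisis signals get a redirect.
POLICY = {
    "cbrn_uplift": Action.HARD_GATE,
    "suicidal_planning": Action.SOFT_REDIRECT,
    "psychosis_or_mania": Action.SOFT_REDIRECT,
    "emotional_dependence": Action.SOFT_REDIRECT,
}


@dataclass
class Turn:
    user_text: str
    flagged_category: str | None  # output of some assumed upstream classifier


def respond(turn: Turn, draft_reply: str) -> tuple[str, bool]:
    """Return (reply, conversation_continues) under the hypothetical policy."""
    action = POLICY.get(turn.flagged_category, Action.ALLOW)
    if action is Action.HARD_GATE:
        # No reframing gets past this: the turn is refused and the session ends.
        return "I can't help with that. This conversation is closed.", False
    if action is Action.SOFT_REDIRECT:
        # The status quo for mental-health signals: a crisis resource is
        # appended, but the model keeps engaging in the same conversation.
        return draft_reply + "\n\nIf you are struggling, please reach out to a crisis line.", True
    return draft_reply, True
```

In a table like this, moving mental-health crisis from `SOFT_REDIRECT` to `HARD_GATE` is structurally a one-line change; the article's complaint is that no lab currently treats that behaviour as unacceptable to ship.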
Until it changes, “AI safety” and “Personal AI Safety” describe two different commitments, even when they appear under the same heading in a [system card](https://openai.com/index/gpt-5-system-card-sensitive-conversations/).

None of this is actually new. People were worrying about cognitive independence, and how new technologies might erode it, long before ChatGPT, mostly in the context of brain-computer interfaces and neurotechnology. The framework even has a name: cognitive freedom, the idea that individuals have a right to mental integrity and freedom from algorithmic manipulation. You can trace it through the [neurorights tradition](https://lsspjournal.biomedcentral.com/articles/10.1186/s40504-017-0050-1) (Ienca & Andorno, 2017) and the [UNESCO Recommendation on the Ethics of Neurotechnology](https://www.unesco.org/en/articles/unesco-adopts-first-global-recommendation-ethics-neurotechnology) (2025). The intellectual scaffolding is already there. The policy is not, especially in the US. Without it, I don’t see what would push frontier labs to take Personal AI Safety as seriously as AI Safety.

Similar Articles

Helping people when they need it most

OpenAI Blog

OpenAI shares details on ChatGPT's layered safeguards for users in mental and emotional distress, including empathetic responses, crisis hotline referrals, and human review for threats of harm to others. The post also notes GPT-5 improvements in reducing sycophancy and better handling mental health emergencies.

Responsible and safe use of AI

OpenAI Blog

OpenAI publishes a guide on responsible and safe use of AI, offering best practices for ChatGPT users including keeping humans in the loop, verifying information, watching for bias, and maintaining transparency in AI usage.

Our approach to mental health-related litigation

OpenAI Blog

OpenAI outlines its approach to handling mental health-related litigation with emphasis on transparency, care, and respect while defending against allegations. The company details its ongoing efforts to improve ChatGPT's safety features for recognizing mental distress and guiding users toward professional support.

An update on our mental health-related work

OpenAI Blog

OpenAI announced updates to its mental health-related work on ChatGPT, including a new trusted contact feature for adult users, improved detection of emotional distress through advanced evaluation methods, and parental controls rolled out in September 2025. The company also addressed ongoing mental health-related litigation consolidated in California courts, committing to transparency and continuous improvement of safety features.

Our commitment to community safety

OpenAI Blog

OpenAI outlines its commitment to community safety, detailing how ChatGPT is trained to detect and mitigate risks of violence and harm through refined safeguards and expert input.