You Can Now Sound the Alarm on AI Behaving Badly

Wired Tools

Summary

A group of AI researchers launched FLARE-AI, a crowdsourced website for reporting and tracking AI misbehavior such as generating malware or leaking personal data, aiming to centralize accountability and transparency in AI safety.

Are you worried your AI chatbot is trying to build a bomb or leak personal information about you? There’s a website for that.

Cached at: 07/01/26, 11:33 PM

# You Can Now Sound the Alarm on AI Behaving Badly

Source: [https://www.wired.com/story/flare-website-ai-flaw-reporting-safety/](https://www.wired.com/story/flare-website-ai-flaw-reporting-safety/)

Writing *AI Lab* each week means I occasionally encounter AI models that behave [badly](https://www.wired.com/story/ai-model-phishing-attack-cybersecurity/) and [bizarrely](https://www.wired.com/story/malevolent-ai-agent-openclaw-clawdbot/). Usually, there’s nothing to be done about it, save for sharing those tales with you. But that could soon change.

A group of AI researchers has set up a crowdsourced [website](https://www.ai-reports.org/introduction-ai-flaw-report), Flaw Reporting for AI (FLARE-AI), for reporting and tracking AI harms. If, for example, a chatbot generates malware or a bomb-making recipe, leaks personal information, or triggers delusional thinking in users, FLARE-AI could be used to sound the alarm.

The open source code behind the system allows others to verify an issue and route reports to model makers, as well as organizations like MITRE, a nonprofit that tracks problems with technical systems. It’s a bit like Downdetector, which compiles real-time user reports for global service outages affecting things like apps and websites.

The website is another step in the group’s ongoing work with AI reporting, [which I first wrote about last year](https://www.wired.com/story/ai-researchers-new-system-report-bugs/). Members of the group also consulted on a [congressional bill announced in June](https://www.govinfo.gov/content/pkg/BILLS-119hr9333ih/pdf/BILLS-119hr9333ih.pdf), which would see the US government take a central role in tracking this kind of AI misbehavior.

“Right now, there is no centralized, accountable way to report flaws in AI systems,” says Avijit Ghosh, an [artificial intelligence](https://www.wired.com/tag/artificial-intelligence/) policy researcher at Hugging Face who co-led development of FLARE-AI with computer scientists [Elaine Zhu](https://elaine.foo/) and [Shayne Longpre](https://www.shaynelongpre.com/). The alarm system was developed in collaboration with 49 AI experts from 32 different organizations.

In [a paper](https://www.ai-reports.org/paper.pdf) outlining the work, the researchers argue that their initiative could prove crucial as AI is adopted more widely and as agentic systems gain greater power. The lack of a consistent way to report AI flaws is a significant problem, they believe.

“I think it’s a really good initiative,” says Jessica Ji, a researcher at the think tank Center for Security and Emerging Technology. Ji says the researchers are right to note that existing reporting mechanisms are fragmented and that AI models are black boxes. “I’m in support of anything that makes AI more transparent,” she says.

Though bugs and cybersecurity problems get a lot of attention, [especially of late](https://www.wired.com/story/anthropic-says-us-government-ordered-it-to-shut-down-mythos-models/), Ghosh tells me that problems with AI systems span topics like psychological harm, discrimination or bias, and misinformation. He adds that different companies have different standards around such issues, which means some problems go unrecognized. “In the absence of a coordinated disclosure system, there are no external mechanisms to enforce transparency,” Ghosh says.

A spate of recent incidents involving popular AI tools shows how easily the technology can go bad. This week, a company called LayerX [disclosed a way](https://layerxsecurity.com/blog/bioshocking-ai-gaming-the-ai-browser-and-escaping-its-guardrails/) to dupe AI-infused web browsers, including OpenAI’s Atlas and Perplexity’s Comet, into vaulting their guardrails. Convincing the AI model behind the browser that it was playing a game, for example, could lead to the browser going rogue and trying to hack a website. (The companies responsible for the affected browsers have fixed the issue, LayerX says.) And this April, Johann Rehberger, a security researcher, discovered a [way to trick](https://embracethered.com/blog/posts/2026/breaking-opus-4.7-with-chatgpt/) Claude into divulging personal data using images generated by ChatGPT.

AI introduces bizarre new kinds of problems, too. Last year, OpenAI was forced to [update its models](https://openai.com/index/sycophancy-in-gpt-4o/) after it discovered that they were overly sycophantic, which sometimes appeared to encourage delusional thinking.

Rumman Chowdhury, the CEO and founder of Humane Intelligence PBC, says FLARE-AI could be a useful way for many AI developers to implement ways of reporting issues with their tools. But she adds that such initiatives often come with serious challenges. One is managing a flood of reported issues, many of which may not be serious. Another is ensuring reporting schemes are backed by credible and authoritative organizations.

Last month’s congressional bill could put some US government heft behind an effort like FLARE-AI. The legislation, introduced by Representatives Deborah Ross, Jeff Hurd, and Don Beyer, would require the National Institute of Standards and Technology to develop standards around AI flaw reporting and to maintain a centralized AI flaw reporting database. Ghosh and his co-leads say this would incentivize AI developers to address issues in their systems and let users examine the safety of different systems for different use cases.

The [need for new ways to report](https://www.wired.com/story/ai-arms-race-china-us-cooperation/) AI harms only seems likely to grow. Agentic systems like [OpenClaw](https://www.wired.com/story/malevolent-ai-agent-openclaw-clawdbot/) have greater potential to do harm, as do models that are more capable of [probing and hacking](https://www.wired.com/story/anthropic-restores-access-to-mythos/) computer systems. I may be using FLARE-AI to report my own misadventures soon enough.

---

*This is an edition of* [***Will Knight’s***](https://www.wired.com/author/will-knight/) *[**AI Lab newsletter**](https://www.wired.com/newsletter?sourceCode=editarticle). Read previous newsletters* [***here.***](https://www.wired.com/tag/ai-lab/)
