ChatGPT, Gemini, Claude, Grok Fail Accuracy Test on Election Topics: Forum AI

Reddit r/ArtificialInteligence 05/21/26, 12:04 PM News

ai-chatbots election-accuracy political-bias factual-errors forum-ai misinformation

Summary

A Forum AI study found that major chatbots, including ChatGPT, Gemini, Claude, and Grok, struggle to provide accurate and unbiased election information: election-related responses failed on accuracy, bias, or source selection 90% of the time.


Original Article


Cached at: 05/21/26, 01:40 PM

# ChatGPT, Gemini, Claude, Grok Fail Accuracy Test on Election Topics: Forum AI

Source: [https://www.ibtimes.sg/chatgpt-grok-ai-chatbots-fail-election-accuracy-show-political-bias-forum-ai-86738](https://www.ibtimes.sg/chatgpt-grok-ai-chatbots-fail-election-accuracy-show-political-bias-forum-ai-86738)

The world's leading artificial intelligence (AI) chatbots, including ChatGPT, Gemini, Claude and Grok, are struggling to provide accurate and politically balanced information on elections and geopolitics, according to Forum AI.

The report found that responses related to elections from major [chatbot](https://www.ibtimes.sg/openai-faces-lawsuit-after-teens-parents-allege-chatgpt-recommended-deadly-drug-combination-86395) platforms "failed on accuracy, bias, or source selection 90% of the time," underscoring growing concerns about the reliability of AI-powered tools increasingly used to access information.

Researchers tested OpenAI's [ChatGPT](https://www.ibtimes.sg/chatgpt-privacy-under-fire-openai-accused-sharing-user-queries-meta-google-86455), Alphabet-owned Google Gemini, Anthropic's [Claude](https://www.ibtimes.sg/why-anthropic-holding-back-claude-mythos-cybersecurity-risks-force-new-approach-85379) and xAI's Grok by asking more than 3,100 questions covering topics such as healthcare, politics and foreign affairs. The study evaluated GPT-5.5, Gemini 3.1 Pro, Claude Opus 4.7 and Grok 4.3.

According to the [findings](https://www.bloomberg.com/news/videos/2026-05-20/major-chatbots-miss-the-mark-on-news-forum-ai-study-video), nearly 36% of responses to election-related questions contained at least one factual error. Grok recorded the highest error rate, with inaccurate information appearing in nearly 52% of responses.

Researchers also found ideological leanings in chatbot responses, with ChatGPT, Claude and Gemini tending toward politically left-leaning answers, while Grok tilted more toward the political right.

### Election Responses Show Errors and Political Leanings

The study found that even responses appearing balanced and professionally sourced were often misleading. "The most professional-looking answers, backed by the strongest-looking citations, were also the most likely to contain buried factual errors," Forum [said](https://www.straitstimes.com/world/united-states/chatbots-struggle-with-news-accuracy-and-sourcing-ahead-of-us-midterms) on 20 May in a statement, describing it as one of the study's sharpest findings.

Researchers noted that the most neutral-sounding responses generated by Grok and Claude still included at least one factual error about 40% of the time.

Beyond elections, the report also highlighted concerns over chatbot sourcing practices. In roughly 35% of foreign policy-related responses, the AI systems cited state-controlled international media outlets, including China's Global Times and CGTN, as well as Russia's RT. ChatGPT and Grok relied on state-owned media most frequently, citing them in 51% and 44% of responses, respectively.

Experts say chatbot struggles with news accuracy are not entirely surprising, especially during breaking events when reliable information is scarce. Most [AI systems](https://www.ibtimes.sg/overworked-ai-turns-marxist-study-reveals-surprising-behavior-repetitive-tasks-86502) are trained on massive amounts of publicly available internet content, which can contain misinformation, bias and incomplete context.

![Parents of a Texas teen who died from an overdose sued OpenAI, alleging ChatGPT encouraged a dangerous drug mix that contributed to his death.](https://data.ibtimes.sg/en/full/88532/parents-of-a-texas-teen-who-died-from-an-overdose-sued-openai-alleging-chatgpt-encouraged-a-dangerous-drug-mix-that-contributed-to-his-death.jpg?w=736)
Freepik

### Forum AI Warns of Risks Ahead of Midterm Elections

According to Bloomberg, [Campbell Brown](https://www.bloomberg.com/news/videos/2026-05-20/major-chatbots-miss-the-mark-on-news-forum-ai-study-video), CEO of Forum AI and a former Meta executive, said the findings are particularly concerning as the US moves closer to another election cycle.

"Independent evaluation is important," Brown said, arguing that AI companies should not be left to assess their own products and adding that "the model companies are essentially grading their own homework".

"And it's really important that there be companies outside of the model companies that are doing this work and sharing the results," she added.

Responding to the report, an Anthropic spokesperson [said](https://www.straitstimes.com/world/united-states/chatbots-struggle-with-news-accuracy-and-sourcing-ahead-of-us-midterms), "We'd welcome the opportunity to review the underlying data behind this report."

"Claude is trained to be politically even-handed in its responses, and to treat opposing viewpoints with equal depth, engagement, and quality of analysis, without bias towards any particular ideological position," the statement added.

### Yale University Research

A recent [Yale University](https://news.yale.edu/2026/03/03/ais-hidden-bias-chatbots-can-influence-opinions-without-trying) study also found that AI chatbots can subtly influence people's political and social opinions even when users are only seeking factual information.
Researchers said "latent biases" in LLMs, shaped by ideological patterns in training data, can affect how historical events are framed. The study tested 1,912 participants who read AI-generated or Wikipedia summaries about events, including the 1919 Seattle General Strike and 1968 student protests. Researchers found that default AI summaries and those with liberal framing shifted opinions in a more liberal direction, while conservative framing influenced politically conservative readers. Though the effects were modest, researchers warned repeated chatbot use could gradually shape opinions over time.


