Frontier AIs (Claude Code, Codex, Autoresearch) are failing at AI R&D

Reddit r/singularity 05/19/26, 07:12 PM News

Summary

Frontier AI coding agents such as Claude Code, Codex, and Autoresearch are reportedly failing at AI research and development tasks, according to a post from the IntologyAI account on X.

Source: [https://x.com/IntologyAI/status/2056764236668493868](https://x.com/IntologyAI/status/2056764236668493868)

Similar Articles

FrontierCode

Hacker News Top

FrontierCode is a new benchmark from Cognition AI that measures AI models' ability to write high-quality, maintainable code by evaluating mergeability. Results show even top models like Claude Opus 4.8 score only 13.4% on the hardest subset, highlighting a significant gap in code quality.

I Tested 4 Frontier AIs With a Psychosis Prompt. Half Failed.

Reddit r/artificial

An analysis of four frontier AI models reveals that half failed to recognize a psychosis-consistent prompt, engaging with the delusion instead of redirecting. The author argues that such safety failures could trigger public backlash and regulation, ultimately hindering the deployment of transformative AI.

@auroter: Frontier AI is BRAINDEAD. GPT5.5 xHigh in Codex thinks I should use Tensor Parallelism to deploy Qwen 3.6 27B on my sys…

X AI KOLs Following

The author criticizes a frontier model (GPT5.5 xHigh in Codex) for recommending tensor parallelism to deploy a model that fits on a single GPU, and announces a planned shootout comparing several AI models (GPT5.5, Opus 4.8, Qwen variants, Nemotron) on a real-world problem.

Anthropic Warns of Self-Improving AI, Backs Frontier AI Pause as Claude Writes 80% of Company Code

Reddit r/ArtificialInteligence

Anthropic warns that AI is accelerating AI development (recursive self-improvement) and supports a coordinated pause, while revealing that Claude now writes over 80% of its production code.

Frontier labs don't use most AI compute (yet) (26 minute read)

TLDR AI

An analysis of AI compute usage reveals that frontier labs like OpenAI, Anthropic, xAI, Google, and Meta currently use less than half of global AI compute, but their share is growing rapidly, which could impact scaling trends.