Recursive releases an automated AI research system that achieves state-of-the-art results on three benchmarks: fixed-budget language model training, small-model training speed, and GPU kernel optimization. The system automates the research loop and open-sources artifacts from its runs.
Josh Tobin teases Recursive_SI's automated researchers, showing early demos of performance optimization capabilities.
Recursive's automated AI research system achieves state-of-the-art results on NanoChat, NanoGPT Speedrun, and GPU kernel benchmarks by automating the research loop without task-specific adaptations. The company also open-sources artifacts from the runs for further inspection.
Recursive releases early results from its automated AI research system, which achieves state-of-the-art results in fixed-budget language model training, small-model training speed, and GPU kernel optimization, and open-sources the resulting artifacts.
Harvard University's AutoScientists takes a decentralized multi-agent approach in which agents share experimental status, form teams automatically, and review one another's research plans. It significantly outperforms existing methods on multiple benchmarks.
A new paper from Meta, Stanford, and Google introduces AutoResearchClaw, which improves automated research by integrating failure recovery, debate, and selective human input. It outperforms AI Scientist v2 by 54.7% on ARC-Bench and finds that autonomy works better when constrained by process than when given unlimited freedom.
SciAtlas is a large-scale, multi-disciplinary academic knowledge graph containing over 43 million papers and 3 billion triplets. Paired with a neuro-symbolic retrieval algorithm, it is designed to provide structured knowledge for AI-driven automated scientific research.
Professor Jie Ding open-sourced Autoresearch and WorldSeed, AI agent frameworks capable of autonomously reviewing 72 peer-reviewed papers overnight to address a research problem.
Jerry Tworek, former OpenAI VP of Research, founds a new AI lab focused on automated research, with a strong team drawn from OpenAI, Anthropic, and DeepMind.
Anthropic Fellows research uses Claude Opus 4.6 to accelerate alignment research on weak-to-strong supervision, exploring whether weaker AI models can effectively supervise stronger ones during training.
DeepLearning.AI launches 'Build with Andrew,' a course enabling non-coders to build web applications using AI in under 30 minutes, while research addresses LLM transparency issues including model honesty and automated scientific research capabilities.
Anthropic researchers demonstrate that Claude Opus 4.6 can autonomously act as an alignment researcher to improve weak-to-strong supervision techniques, addressing challenges in scalable oversight.