autonomous-research

#autonomous-research

@yibie: Recommend this article. The author of Superpowers ran a complete autoresearch loop with Fable 5 — 25 experiments, $165, improving build speed by 50% and reducing token costs by 60%. But the most valuable part of this article is not the result numbers; it's the complete record of the process…

X AI KOLs Timeline · 18h ago Cached

Superpowers 6 has been released. Its author used Fable 5 to run 25 autonomous experiments, improving build speed by 50% and reducing token costs by 60%, and kept detailed records of the experimental process and the lessons learned from failures.

#autonomous-research

Autonomous discovery of traffic laws with AI traffic scientists

arXiv cs.AI · 23h ago Cached

This paper presents TrafficSci, an agentic AI system that automates the discovery of universal traffic laws across cities through iterative workflows, successfully rediscovering established laws and identifying a new temporal memory scale in urban driving behavior.

#autonomous-research

@VraserX: OpenAI’s AI research intern, coming around September, feels like early AGI to me. Not because it’s some magic super bra…

X AI KOLs Timeline · 3d ago Cached

A tweet speculates that OpenAI's upcoming AI research intern (September) feels like early AGI, and predicts a fully autonomous AI researcher by 2027-2028, which could be the first ASI.

#autonomous-research

@VukRosic99: Build LLM in 1 Prompt + Setup Autoresearch By DeepSeek Researcher (his side project) A live build where you create a fu…

X AI KOLs Timeline · 5d ago Cached

Demonstrates building a full LLM using a single prompt to an AI coding agent (Claude Code/Codex) and installing an autonomous AI research skill by a DeepSeek researcher, covering architecture, failure modes, and unattended operation.

#autonomous-research

@VukRosic99: A DeepSeek researcher just open-sourced his AutoResearch personal project. For the first time, the AutoResearch Agent a…

X AI KOLs Timeline · 2026-06-18 Cached

A DeepSeek researcher open-sourced AutoResearch, an autonomous framework that can plan, execute, and debug RL experiments on the DeepSeek 285B model without human intervention, accompanied by a self-play survey paper.

#autonomous-research

@OpenAI: GPT-5.4 helped drive a medicinal chemistry project from literature review to a validated experimental result. Paired wi…

X AI KOLs · 2026-06-17 Cached

GPT-5.4, in collaboration with Molecule.one's Maria AI platform, autonomously drove a medicinal chemistry project from literature review to validated experimental result, proposing an unexpected improvement to a widely used reaction in drug discovery.

#autonomous-research

@victor207755822: Deli AutoResearch SKILL is now officially open source! https://victorchen96.github.io/auto_research/framework.html… Alo…

X AI KOLs Timeline · 2026-06-17 Cached

Deli AutoResearch SKILL, an autonomous framework that automates GPU experiments and RL pipelines, is now open source, with a companion survey paper on self-play.

#autonomous-research

Sakana Marlin (4 minute read)

TLDR AI · 2026-06-16 Cached

Sakana AI launches its first commercial product, Sakana Marlin, an autonomous research assistant that completes strategy work in hours by generating structured slides and detailed reports.

#autonomous-research

@THUTeamEureka: 1/3 Excited to open-source EurekAgent! A fully autonomous research system for metric-driven tasks, built with Claude Co…

X AI KOLs Timeline · 2026-06-15 Cached

THU Team Eureka open-sources EurekAgent, an autonomous research system built with Claude Code that achieves state-of-the-art results on math, kernel engineering, and ML tasks through environment engineering.

#autonomous-research

@_akhaliq: paper:

X AI KOLs Following · 2026-06-11 Cached

A paper introducing Arbor, an AI framework that enables autonomous scientific research by combining strategic coordination, isolated hypothesis testing, and a persistent knowledge tree to iteratively improve research outcomes across multiple domains.

#autonomous-research

@_akhaliq: Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

X AI KOLs Following · 2026-06-11 Cached

This paper proposes a method for autonomous research agents using hypothesis-tree refinement to generate and test hypotheses, aiming toward generalist scientific discovery.

#autonomous-research

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Hugging Face Daily Papers · 2026-06-10 Cached

Arbor is an AI framework for autonomous scientific research that uses a coordinator, executors, and a persistent hypothesis tree to iteratively improve research outcomes across multiple domains, achieving strong results on six real research tasks.
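The summary above names three parts: a coordinator, executors, and a persistent hypothesis tree. As a rough illustration of how such a loop could fit together, here is a minimal Python sketch. It is not Arbor's implementation; `Node`, `refine`, `propose`, and `evaluate` are hypothetical names, and the greedy "expand the best-scoring leaf" rule is an assumption chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Node:
    """One hypothesis in the tree, with the score an executor measured for it."""
    hypothesis: str
    score: float
    children: List["Node"] = field(default_factory=list)


def iter_nodes(node: Node) -> Iterator[Node]:
    """Yield every node in the tree, parents before children."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def refine(
    root_hypothesis: str,
    propose: Callable[[str], List[str]],   # hypothetical: suggests child hypotheses
    evaluate: Callable[[str], float],      # hypothetical: runs one isolated test
    rounds: int,
) -> Node:
    """Grow the tree for a fixed number of rounds and return the best node found."""
    root = Node(root_hypothesis, evaluate(root_hypothesis))
    for _ in range(rounds):
        # Coordinator step (assumed greedy): expand the best-scoring leaf.
        leaves = [n for n in iter_nodes(root) if not n.children]
        parent = max(leaves, key=lambda n: n.score)
        # Executor step: test each proposed child on its own and record it.
        for hypothesis in propose(parent.hypothesis):
            parent.children.append(Node(hypothesis, evaluate(hypothesis)))
    # The tree keeps every tested hypothesis, so the best one is never lost.
    return max(iter_nodes(root), key=lambda n: n.score)
```

In a real system `propose` and `evaluate` would be model calls and experiment runs; keeping the whole tree rather than only the current best is what makes earlier results available to later rounds.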

#autonomous-research

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

arXiv cs.LG · 2026-06-09 Cached

ResearchClawBench is a benchmark for evaluating end-to-end autonomous scientific research across 40 tasks from 10 domains, using expert-curated rubrics. Current systems score poorly, highlighting challenges in achieving reliable autonomous scientific discovery.

#autonomous-research

Autonomous heterogeneous catalyst discovery with a self-evolving multi-agent digital twin

arXiv cs.AI · 2026-06-08 Cached

This paper presents CatDT, a self-evolving multi-agent digital twin that autonomously predicts heterogeneous catalyst properties from a bulk crystal and a reaction description, achieving experimental accuracy across seven benchmarks and discovering non-precious catalyst candidates for propane dehydrogenation.

#autonomous-research

@dair_ai: Outstanding paper on long-horizon agents. (bookmark it) Similar to humans, how do you make agents persist on a difficul…

X AI KOLs Following · 2026-06-04 Cached

AutoLab is a new benchmark that evaluates 17 frontier models on 36 expert-curated long-horizon tasks (system optimization, model development, CUDA kernels, puzzles). It finds that persistence, not the quality of the initial attempt, is the dominant predictor of success. Claude-opus-4.6 led all categories, while most other models terminated prematurely or exhausted their budgets with minimal progress.

#autonomous-research

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Hugging Face Daily Papers · 2026-06-01 Cached

AutoMedBench is a workflow-aware benchmark for autonomous medical-AI research, evaluating agents across five stages on diverse medical imaging tasks. Stage-level scoring reveals validation as the weakest stage, highlighting the need for reliable verification in agentic workflows.

#autonomous-research

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

Hugging Face Daily Papers · 2026-05-28 Cached

ResearchClawBench is a benchmark for evaluating end-to-end autonomous scientific research across 40 tasks from 10 domains. It reveals that current AI agents and LLMs achieve low re-discovery accuracy, with Claude Code averaging a score of 21.5 and Claude-Opus-4.7 averaging 20.7.

#autonomous-research

ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence

arXiv cs.AI · 2026-05-27 Cached

ScientistOne introduces Chain-of-Evidence, a verifiability framework for autonomous research agents that ensures every claim is traceable to evidence, achieving zero hallucinated references, perfect score verification, and the highest method-code alignment across 75 papers while matching or exceeding human expert performance on frontier research tasks.

#autonomous-research

@RodmanAi: Holy shit. Someone just open-sourced a financial brain. It’s called Dexter. → Finds undervalued stocks → Breaks down en…

X AI KOLs Timeline · 2026-05-22 Cached

Dexter is an open-source autonomous financial research agent that analyzes stocks and builds investment theses using real-time data, task planning, and self-reflection.

#autonomous-research

@sitinme: Saw Karpathy open-sourced a very interesting project autoresearch, which gives a real but small-scale LLM training task to an AI Agent, letting it do research, modify code, run experiments, look at results, and then decide whether to keep or discard the changes. The project is based on a single NVIDIA…

X AI KOLs Timeline · 2026-05-21 Cached

Karpathy open-sourced an experimental project, autoresearch, that lets an AI Agent automatically complete the research loop for small-scale LLM training: modify code, run experiments, evaluate results, and iterate. Humans only need to write the research plan and constraints.
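The loop described above (modify, run, evaluate, then keep or discard) can be sketched in a few lines of Python. This is an illustrative toy, not code from the autoresearch project: `run_experiment` and `propose_change` are hypothetical stand-ins for a real training run and for the agent's code edit, and the loss function is invented so the example runs instantly.

```python
import random


def run_experiment(params):
    """Hypothetical stand-in for a training run; returns a loss (lower is better)."""
    return (params["lr"] - 0.003) ** 2 + 1e-8 * (params["width"] - 256) ** 2


def propose_change(params, rng):
    """Hypothetical stand-in for the agent's edit: perturb one setting."""
    candidate = dict(params)
    if rng.random() < 0.5:
        candidate["lr"] = params["lr"] * rng.choice([0.5, 2.0])
    else:
        candidate["width"] = max(32, params["width"] + rng.choice([-64, 64]))
    return candidate


def research_loop(initial, budget, seed=0):
    """Run `budget` experiments, keeping a change only if it lowers the loss."""
    rng = random.Random(seed)
    best, best_loss = initial, run_experiment(initial)
    log = []  # one (step, loss, kept) record per experiment, including failures
    for step in range(budget):
        candidate = propose_change(best, rng)
        loss = run_experiment(candidate)
        kept = loss < best_loss
        if kept:
            best, best_loss = candidate, loss  # keep the change
        # otherwise the candidate is discarded and `best` stays as it was
        log.append((step, loss, kept))
    return best, best_loss, log


if __name__ == "__main__":
    best, best_loss, log = research_loop({"lr": 0.1, "width": 64}, budget=20)
    print(best, best_loss, sum(kept for _, _, kept in log), "changes kept")
```

Logging discarded runs alongside kept ones matters as much as the final result, since the failed attempts are what a human reviewer reads to understand why the loop ended where it did.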
