Tag
This paper proposes a method for autonomous research agents using hypothesis-tree refinement to generate and test hypotheses, aiming toward generalist scientific discovery.
AutoSci is a memory-centric agentic system designed to automate the full scientific research lifecycle, from literature understanding to rebuttal, using LLM-based agents with persistent memory and self-evolution capabilities.
Feynman is an open-source AI agent that coordinates four collaborating agents to automate PhD-level research processes (including arXiv research, literature review, and code verification), requiring only a single instruction from the user.
A system built on Claude Code controls Google's NotebookLM from the terminal, automating research by searching YouTube, uploading sources, and exporting cited answers directly into Obsidian. This workflow eliminates the need for multiple browser tabs and manual copying, and citation accuracy is verified.
Hugging Face open-sourced ml-intern, an autonomous agent that performs the entire ML post-training loop (reading papers, finding datasets, writing scripts, generating data, monitoring training, and uploading weights), achieving a significant GPQA improvement with a 1.7B model in 10 hours without human intervention.
This survey paper examines the transition of AI from task-specific assistants to workflow-level research automators, defines AutoResearch as the spectrum of AI-powered scientific workflow automation, and analyzes challenges in autonomy, reproducibility, and accountability.
ARIS is an open-source tool that has gone viral on GitHub (8.8k stars). It uses a lightweight Markdown skill pack to enable Claude Code or other LLM agents to autonomously complete the entire machine learning research lifecycle, including literature review, experiment execution, and paper writing.
NanoResearch is a multi-agent framework designed to personalize research automation by co-evolving skills, memory, and policy to adapt to individual user preferences and research styles.
EvoScientist is an open-source framework that automates research workflows using self-evolving AI scientists with persistent multi-agent memory, adopting a human-on-the-loop paradigm for autonomous research exploration and insight generation.
The article explains how Manus's Browser Operator works: it runs inside the user's authorized local browser session, which lets it access subscription-based and authenticated content beyond typical AI search capabilities. The article also provides a step-by-step guide for enabling and using it.
Hugging Face replaced its post-training team with an autonomous agent that reads papers, runs GPU experiments, and improves models, achieving a 22-point benchmark jump in under 10 hours and beating Codex on HealthBench by 60%.
Andrej Karpathy's autoresearch pattern highlights how current AI agents run experiments in isolation, wasting compute by duplicating work and rediscovering dead ends.
The article argues that there is a high likelihood (60%+) of fully automated AI R&D, in which AI systems can build their own successors without human involvement, by the end of 2028, citing evidence from coding benchmarks like SWE-Bench and trends in AI autonomy.