@yibie: awesome-autoresearch updated, added 3 entries. dreamworld — world model research. Applied the autoresearch loop to pixel-level world model training (CarRacing-v3), where the agent can perform keep/discard experiments autonomously in tokenizer, dy…
Summary
The awesome-autoresearch list was updated with three entries: dreamworld (world model research), Odyssey Engine (a general-purpose iteration engine), and an article by Kirill Krainov on self-improving agentic coding skills.
Cached at: 05/24/26, 06:36 PM
awesome-autoresearch updated, 3 new entries added.
- dreamworld: world model research. It applies the autoresearch loop to pixel-level world model training (CarRacing-v3). The agent autonomously runs keep/discard experiments across three stages (tokenizer, dynamics, and planning), with remote GPU orchestration via Modal. Inspired by AutoGo and Karpathy’s original loop.
- Odyssey Engine: a general-purpose iteration engine that combines three patterns: autoresearch (metric-driven experiments), gaggle-iterate (checkpointing/validation/rollback), and ralph-loop (continuous execution). It ships three strategy configurations for Claude Code: engineer, creative, and production. The same loop can drive code optimization, writing refinement, and research exploration.
- Kirill Krainov at http://zerocopy.blog describes in detail how to port autoresearch’s multi-metric evaluation (correctness, clarity, performance, completeness) to the self-improvement of agentic coding skills, applying Karpathy’s stateless keep/discard pattern to coding skill enhancement.
239 implementations + 116 discussions = 355 entries http://github.com/yibie/awesome-autoresearch…
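The multi-metric keep/discard decision mentioned in the Krainov entry can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the four metric names come from the post, while the weighted-mean aggregation, the strict "greater than" rule, and the function names `aggregate` and `decide` are assumptions made for the example.

```python
from typing import Dict, Optional

# Metric names taken from the post; everything else here is hypothetical.
METRICS = ("correctness", "clarity", "performance", "completeness")


def aggregate(scores: Dict[str, float],
              weights: Optional[Dict[str, float]] = None) -> float:
    """Collapse per-metric scores into one number via a weighted mean.

    Assumption: equal weights unless the caller supplies others.
    """
    if weights is None:
        weights = {m: 1.0 for m in METRICS}
    total = sum(weights[m] for m in METRICS)
    return sum(scores[m] * weights[m] for m in METRICS) / total


def decide(best: Dict[str, float], candidate: Dict[str, float]) -> str:
    """Stateless decision: keep the candidate only if it strictly beats
    the current best on the aggregated score, otherwise discard it."""
    return "keep" if aggregate(candidate) > aggregate(best) else "discard"
```

Because the decision depends only on the current best scores and the candidate scores, no history has to be carried between iterations, which is one plausible reading of "stateless" here.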
Kirill Krainov
Source: https://zerocopy.blog/
Introduction: I’ve been testing Ralph loops recently for agentic coding. The idea is simple: spin up new Claude Code sessions for each task to get a fresh context until the…
Read more: Autonomous AI Coding: Ralph Loops with Sub-Agents and Skills (Pt. 1) (https://zerocopy.blog/2026/03/17/autonomous-ai-coding-ralph-loops-with-sub-agents-and-skills-pt-1/)
Similar Articles
@yibie: awesome-autoresearch updated, added 6 entries. Trace2Evolve: applies autoresearch to the self-evolution of customer service agents. It automatically generates hard cases, scores traces, classifies failure causes, and retains improvements only when both the benchmark and the reliability gate pass...
awesome-autoresearch list updated, adding 6 application cases based on Karpathy's autoresearch pattern, covering scenarios such as customer service agent self-evolution, Shell integration, code configuration self-optimization, RAG tuning, and ASO.
@yibie: awesome-autoresearch updated, added 7 entries. pi-lifeline: a very practical pi extension that lets small local models run autoresearch and automatically ask stronger models for help when stuck. Tobi Lütke also mentioned this on X...
awesome-autoresearch list updated, adding 7 entries, including pi-lifeline (local small model asks strong model for help), Marketing Mix Modeling 12x improvement, TokenTelemetry open-source tracking tool, etc., covering autoresearch application cases across multiple industries.
@yibie: This week awesome-autoresearch added 3 items: 1. autoslam: applies Karpathy's autoresearch loop to LiDAR SLAM method design, building an experiment leaderboard on the KITTI benchmark 2. Bir…
This week awesome-autoresearch added three items, including the autoslam project that applies Karpathy's autoresearch loop to LiDAR SLAM, and two blog posts analyzing the original experiments and revealing metric gaming.
@WWTLitee: Is there a way for AI to autonomously iterate and optimize? Yes, check out autoresearch. Its core isn't to have AI directly 'invent papers,' but to break the research process into a verifiable loop: humans write program.md to give research direction, AI agent modifies http://tra…
Introduces the autoresearch project, which breaks down the AI research process into a verifiable loop (fixed environment, single editable file, fixed metric, Git rollback), enabling AI agents to perform controllable and reproducible experiment iterations; also mentions the 12-factor-agents checklist.
@yibie: This week's autoresearch ecosystem evidence scan: 9 new records, total count 383. AutoResearch-RL: A continuous RL research framework with prepare.py/train.py isolation, supporting LLM/hybrid strategy experiment scheduling l…
This week, 9 new records were added to the autoresearch ecosystem, bringing the total to 383, covering multiple open-source tools and projects such as the AutoResearch-RL reinforcement learning framework, lance-autoresearch database kernel optimization, and Clio prediction market backtesting framework.
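The verifiable loop summarized in the @WWTLitee entry above (fixed environment, single editable file, fixed metric, rollback) can be sketched as below. This is an illustration under stated assumptions rather than the autoresearch project's code: the function `run_loop` and its parameters are hypothetical, the metric is assumed to be higher-is-better, and an in-memory snapshot of the file stands in for the Git rollback the summary mentions so that the example stays self-contained.

```python
from pathlib import Path
from typing import Callable


def run_loop(target: Path,
             edit: Callable[[str], str],
             metric: Callable[[Path], float],
             steps: int) -> float:
    """Repeatedly edit one file, keeping a change only if the metric improves.

    target: the single file the agent is allowed to modify.
    edit:   stand-in for the agent; maps current file text to new text.
    metric: fixed evaluation, assumed higher-is-better.
    """
    best = metric(target)
    for _ in range(steps):
        snapshot = target.read_text()       # state to restore on failure
        target.write_text(edit(snapshot))   # the experiment
        score = metric(target)
        if score > best:
            best = score                    # keep the change
        else:
            target.write_text(snapshot)     # discard: roll back the file
    return best
```

In a Git-backed setup the snapshot and restore steps would correspond to committing a kept change and reverting a discarded one; the control flow stays the same.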