@eliebakouch: will be speaking about automated AI research at @aiDotEngineer we believe frontier models need the right environments t…
Summary
The author will speak at aiDotEngineer about using speedruns like nanogpt to evaluate AI research capabilities.
Cached at: 06/25/26, 07:13 AM
will be speaking about automated AI research at @aiDotEngineer
we believe frontier models need the right environments to evolve and do research. we found that speedruns like nanogpt by @karpathy @kellerjordan0 et al are a simple but very effective setup for evaluating research capabilities, and potentially for training and discovery as well
oh and i should (hopefully) have some new interesting results to show, super excited for this!
Similar Articles
@ChengleiSi: Excited to share these preliminary results on our internal autoresearch system @Recursive_SI, where we achieve SOTA on …
Recursive's automated AI research system achieves state-of-the-art results on NanoChat, NanoGPT Speedrun, and GPU kernel benchmarks by automating the research loop without task-specific adaptations; its artifacts are open-sourced for further inspection.
@IntologyAI: Can coding agents do research? We release NanoGPT-Bench, an internal eval we’ve used to test agents on an AI R&D proble…
IntologyAI releases NanoGPT-Bench, an internal benchmark to evaluate coding agents on AI R&D tasks. Current agents recover only 9.3% of human progress, mostly through hyperparameter tuning, highlighting gaps in algorithmic research capabilities.
The AI war is moving from models to machines and I don’t think enough people are talking about it
A commentary arguing that the AI competition is shifting from model quality to hardware placement and infrastructure, highlighting Microsoft's Project Solara, NVIDIA's RTX Spark, and ByteDance's custom CPU efforts as signs that agentic workloads are driving new silicon and deployment strategies.
@stanfordnlp: Lots of @stanfordnlp work at @icmlconf. See you in Seoul! Towards Execution-Grounded Automated AI Research @ChengleiSi …
This paper investigates execution-grounded automated AI research by building an automated executor that implements LLM-generated ideas and runs experiments. It shows that execution-guided evolutionary search can find methods that significantly outperform baselines in both pre-training and post-training tasks.
@lillian_ma_: Emerging autoresearch labs worth following: @AutoScienceAI (@eliot_cowan) One of the cleanest “AI builds AI” bets: agen…
A Twitter thread highlights emerging autoresearch labs that are building AI systems to automate the full research loop, from hypothesis to experimentation.