LEVI is an open-source, AlphaEvolve-like system that runs locally on Qwen3-30B. It optimizes code and prompts at up to 35x lower cost than existing frameworks while outperforming them.
Anthropic shares internal benchmark results showing dramatic improvement in AI coding: on an ML code optimization task, Claude Opus 4 averaged a ~3x speedup in May 2025, while the new Mythos Preview model achieved a ~52x speedup this April. For comparison, a skilled human needs 4-8 hours to reach a 4x speedup.
MIT HAN Lab proposes a method to automatically design and optimize CUDA kernels using an AI agent workflow. Through a process of task contracts, agent loops, and small-step verification, the agent can autonomously iterate and optimize within a specialized toolchain, replacing manual tuning.
Evo is an open-source tool that optimizes codebases with semi-autonomous agents. It runs experiments in parallel, using tree search and multiple subagents to discover and improve metrics.
Researchers from Carnegie Mellon, the University of Washington, and Arm propose AdaExplore, an LLM agent framework for GPU kernel code generation. Using failure-driven adaptation and diversity-preserving search, and no additional fine-tuning, it achieves speedups of 3.12× and 1.72× on the KernelBench Level-2 and Level-3 benchmarks, respectively.