DeepCode: Open Agentic Coding

Papers with Code Trending Papers

Summary

DeepCode is a fully autonomous framework for document-to-codebase synthesis that uses principled information-flow management to convert scientific papers into production-grade code, achieving state-of-the-art results on PaperBench and surpassing PhD-level human experts.

Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis--such as scientific papers to code--primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection via retrieval-augmented generation, and closed-loop error correction. Extensive evaluations on the PaperBench benchmark demonstrate that DeepCode achieves state-of-the-art performance, decisively outperforming leading commercial agents such as Cursor and Claude Code, and crucially, surpassing PhD-level human experts from top institutes on key reproduction metrics. By systematically transforming paper specifications into production-grade implementations comparable to human expert quality, this work establishes new foundations for autonomous scientific reproduction that can accelerate research evaluation and discovery.

Cached at: 05/08/26, 12:27 PM


Source: https://huggingface.co/papers/2512.07921 Published on Dec 8, 2025

Submitted by taesiri (https://huggingface.co/taesiri) on Dec 10, 2025

Abstract

DeepCode, a fully autonomous framework, addresses the challenges of document-to-codebase synthesis by optimizing information flow through source compression, structured indexing, knowledge injection, and error correction, achieving state-of-the-art performance and surpassing human experts.

Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis--such as scientific papers to code--primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection via retrieval-augmented generation, and closed-loop error correction. Extensive evaluations on the PaperBench benchmark demonstrate that DeepCode achieves state-of-the-art performance, decisively outperforming leading commercial agents such as Cursor and Claude Code, and crucially, surpassing PhD-level human experts from top institutes on key reproduction metrics. By systematically transforming paper specifications into production-grade implementations comparable to human expert quality, this work establishes new foundations for autonomous scientific reproduction that can accelerate research evaluation and discovery.
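The four information operations named in the abstract can be pictured with a toy Python sketch. This is not DeepCode's implementation: every name here (distill_blueprint, CodeMemory, retrieve, synthesize, the "SPEC:" line marker, the word budget) is a hypothetical stand-in, the "LLM" is a hard-coded stub, and the heuristics are deliberately naive. It only illustrates how compression, indexing, conditional retrieval, and a repair loop could fit together under a fixed context budget.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


def distill_blueprint(paper: str, budget_words: int) -> list[str]:
    """Source compression (toy): keep lines tagged 'SPEC:' until the word budget is hit."""
    specs = [ln.strip() for ln in paper.splitlines() if ln.strip().startswith("SPEC:")]
    kept, used = [], 0
    for spec in specs:
        n = len(spec.split())
        if used + n > budget_words:
            break
        kept.append(spec)
        used += n
    return kept


@dataclass
class CodeMemory:
    """Structured indexing (toy): remember definition lines per file, not full code."""
    summaries: dict[str, str] = field(default_factory=dict)

    def record(self, path: str, code: str) -> None:
        defs = [ln.strip() for ln in code.splitlines()
                if ln.lstrip().startswith(("def ", "class "))]
        self.summaries[path] = "; ".join(defs)

    def context(self) -> str:
        return "\n".join(f"{p}: {s}" for p, s in sorted(self.summaries.items()))


def retrieve(query: str, knowledge: dict[str, str]) -> Optional[str]:
    """Conditional knowledge injection (toy): return a snippet only on keyword overlap."""
    words = set(query.lower().split())
    best_key, best_score = None, 0
    for key in knowledge:
        score = len(words & set(key.lower().split()))
        if score > best_score:
            best_key, best_score = key, score
    return knowledge[best_key] if best_key is not None else None


def synthesize(spec: str, memory: CodeMemory, knowledge: dict[str, str],
               generate: Callable[[str, Optional[str]], str],
               check: Callable[[str], Optional[str]], max_repairs: int = 3) -> str:
    """Closed-loop error correction: regenerate with the last error until check passes."""
    hint = retrieve(spec, knowledge)
    prompt = "\n".join(part for part in (spec, memory.context(), hint) if part)
    error = None
    for _ in range(max_repairs + 1):
        code = generate(prompt, error)
        error = check(code)
        if error is None:
            return code
    raise RuntimeError(f"could not repair: {error}")


def stub_generate(prompt: str, error: Optional[str]) -> str:
    # Stand-in for a model call: the first attempt is deliberately broken.
    if error is None:
        return "def relu(x) return max(0, x)"
    return "def relu(x):\n    return max(0, x)\n"


def syntax_check(code: str) -> Optional[str]:
    # Returns None when the code compiles, otherwise the error text.
    try:
        compile(code, "<generated>", "exec")
        return None
    except SyntaxError as exc:
        return str(exc)


paper = ("Intro text\n"
         "SPEC: implement relu activation\n"
         "SPEC: implement training loop with adam optimizer\n")
blueprint = distill_blueprint(paper, budget_words=6)
memory = CodeMemory()
knowledge = {"relu activation": "relu(x) = max(0, x)"}
code = synthesize(blueprint[0], memory, knowledge, stub_generate, syntax_check)
memory.record("ops.py", code)
print(memory.context())
```

In this sketch the second spec is dropped because it would exceed the six-word budget, and the memory stores only the function signature of the generated file, so later prompts would carry a compact index instead of the full code.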


Get this paper in your agent:

hf papers read 2512.07921

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

CodeAlchemy: Synthetic Code Rewriting at Scale

arXiv cs.CL

CodeAlchemy is a synthetic data generation framework that transforms publicly available code into semantically rich training data using five strategies, producing over 500 billion tokens and enabling small models to outperform much larger ones on code benchmarks.

@RealCodedAlpha: https://x.com/RealCodedAlpha/status/2064921935507837260

X AI KOLs Timeline

An in-depth article on mastering OpenAI Codex, covering a complete knowledge system from mental models to practical applications such as large-scale code migration, security auditing, performance optimization, team collaboration, building a personal AI operating system, and product development.

Harness engineering: leveraging Codex in an agent-first world

OpenAI Blog

OpenAI describes an internal experiment building a production software product with zero manually-written code using Codex agents, achieving approximately 10x faster development with 1.5M lines of code written by AI in five months. The team learned that effective agent-driven development requires engineers to focus on systems design, scaffolding, and feedback loops rather than writing code directly.

Codex-maxxing

Hacker News Top

Jason Liu shares how he uses OpenAI's Codex for knowledge work beyond coding, leveraging durable threads, voice input, and steering to integrate coding agents into his broader workflow.