DeepCode: Open Agentic Coding
Summary
DeepCode is a fully autonomous framework for document-to-codebase synthesis that uses principled information-flow management to convert scientific papers into production-grade code, achieving state-of-the-art results on PaperBench and surpassing PhD-level human experts.
Source: https://huggingface.co/papers/2512.07921
Published on Dec 8, 2025
Submitted by taesiri (https://huggingface.co/taesiri) on Dec 10, 2025
Abstract
DeepCode, a fully autonomous framework, addresses the challenges of document-to-codebase synthesis by optimizing information flow through source compression, structured indexing, knowledge injection, and error correction, achieving state-of-the-art performance and surpassing human experts.
Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis (such as scientific papers to code), primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection via retrieval-augmented generation, and closed-loop error correction. Extensive evaluations on the PaperBench benchmark demonstrate that DeepCode achieves state-of-the-art performance, decisively outperforming leading commercial agents such as Cursor and Claude Code, and crucially, surpassing PhD-level human experts from top institutes on key reproduction metrics. By systematically transforming paper specifications into production-grade implementations comparable to human expert quality, this work establishes new foundations for autonomous scientific reproduction that can accelerate research evaluation and discovery.
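The four information operations named in the abstract can be illustrated with a toy sketch. This is not DeepCode's implementation: every function, class, and parameter name below (`distill_blueprint`, `CodeMemory`, `retrieve`, `repair_loop`, `budget`, `keywords`) is hypothetical, and each operation is replaced by a deliberately simple stand-in (keyword filtering, function-name indexing, word-overlap ranking, and a syntax check) chosen only to show the shape of the idea.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass, field


def distill_blueprint(paper: str, keywords: set[str], budget: int) -> list[str]:
    """Source compression (toy): keep keyword-bearing sentences within a word budget."""
    kept, used = [], 0
    for sentence in (s.strip() for s in paper.split(".") if s.strip()):
        words = sentence.split()
        relevant = any(w.lower().strip(",") in keywords for w in words)
        if relevant and used + len(words) <= budget:
            kept.append(sentence)
            used += len(words)
    return kept


@dataclass
class CodeMemory:
    """Structured indexing (toy): remember each file's function names, not its full text."""
    index: dict[str, list[str]] = field(default_factory=dict)

    def record(self, path: str, source: str) -> None:
        tree = ast.parse(source)
        self.index[path] = [node.name for node in ast.walk(tree)
                            if isinstance(node, ast.FunctionDef)]

    def summary(self) -> str:
        return "\n".join(f"{path}: {', '.join(names)}"
                         for path, names in sorted(self.index.items()))


def retrieve(query: str, corpus: dict[str, str], k: int = 1) -> list[str]:
    """Knowledge injection (toy): rank snippet names by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda name: -len(query_words & set(corpus[name].lower().split())),
    )
    return ranked[:k]


def repair_loop(candidates: list[str], max_rounds: int = 3) -> tuple[str | None, list[str]]:
    """Closed-loop error correction (toy): try candidates until one parses as Python."""
    errors: list[str] = []
    for source in candidates[:max_rounds]:
        try:
            ast.parse(source)
            return source, errors
        except SyntaxError as exc:
            errors.append(exc.msg)  # in a real loop this feedback would guide the next attempt
    return None, errors


if __name__ == "__main__":
    paper = "We train with Adam. The weather was nice. The loss is cross entropy."
    print(distill_blueprint(paper, {"adam", "loss"}, budget=10))

    memory = CodeMemory()
    memory.record("train.py", "def train():\n    pass\n\ndef evaluate():\n    pass\n")
    print(memory.summary())
```

The common thread in the sketch is that each step passes a smaller, more targeted view of the available information to the next one, which is one way to read the abstract's phrase "maximize task-relevant signals under finite context budgets".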
Similar Articles
CodeAlchemy: Synthetic Code Rewriting at Scale
CodeAlchemy is a synthetic data generation framework that transforms publicly available code into semantically rich training data using five strategies, producing over 500 billion tokens and enabling small models to outperform much larger ones on code benchmarks.
Building Decypher: An Execution Context Engine for Agents
Decypher is a tool that provides deep execution context for agentic coding by performing semantic analysis of codebases, enabling agents to answer queries about code structure and interactions.
@RealCodedAlpha: https://x.com/RealCodedAlpha/status/2064921935507837260
An in-depth article on mastering OpenAI Codex, covering a complete knowledge system from mental models to practical applications such as large-scale code migration, security auditing, performance optimization, team collaboration, building a personal AI operating system, and product development.
Harness engineering: leveraging Codex in an agent-first world
OpenAI describes an internal experiment building a production software product with zero manually-written code using Codex agents, achieving approximately 10x faster development with 1.5M lines of code written by AI in five months. The team learned that effective agent-driven development requires engineers to focus on systems design, scaffolding, and feedback loops rather than writing code directly.
Codex-maxxing
Jason Liu shares how he uses OpenAI's Codex for knowledge work beyond coding, leveraging durable threads, voice input, and steering to integrate coding agents into his broader workflow.