AutoDev: Automated AI-Driven Development
Summary
AutoDev is an AI-driven software development framework that automates complex engineering tasks, such as code and test generation, within a secure Docker environment. Its AI agents plan and execute these tasks autonomously, and on the HumanEval dataset it reports Pass@1 scores of 91.5% for code generation and 87.8% for test generation.
Source: https://huggingface.co/papers/2403.08299
Published on Mar 13, 2024
Abstract
AutoDev is an AI-driven software development framework that automates complex engineering tasks within a secure Docker environment, achieving high performance in code and test generation.
The landscape of software development has witnessed a paradigm shift with the advent of AI-powered assistants, exemplified by GitHub Copilot. However, existing solutions are not leveraging all the potential capabilities available in an IDE such as building, testing, executing code, git operations, etc. Therefore, they are constrained by their limited capabilities, primarily focusing on suggesting code snippets and file manipulation within a chat-based interface. To fill this gap, we present AutoDev, a fully automated AI-driven software development framework, designed for autonomous planning and execution of intricate software engineering tasks. AutoDev enables users to define complex software engineering objectives, which are assigned to AutoDev’s autonomous AI Agents to achieve. These AI agents can perform diverse operations on a codebase, including file editing, retrieval, build processes, execution, testing, and git operations. They also have access to files, compiler output, build and testing logs, static analysis tools, and more. This enables the AI Agents to execute tasks in a fully automated manner with a comprehensive understanding of the contextual information required. Furthermore, AutoDev establishes a secure development environment by confining all operations within Docker containers. This framework incorporates guardrails to ensure user privacy and file security, allowing users to define specific permitted or restricted commands and operations within AutoDev. In our evaluation, we tested AutoDev on the HumanEval dataset, obtaining promising results with 91.5% and 87.8% of Pass@1 for code generation and test generation respectively, demonstrating its effectiveness in automating software engineering tasks while maintaining a secure and user-controlled development environment.
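The guardrail idea in the abstract, where users define permitted or restricted commands, can be illustrated with a minimal sketch. This is not AutoDev's actual configuration format or API: the `CommandPolicy` class, its field names, and the example command sets are all hypothetical, chosen only to show how an allow/deny check might gate an agent's command before it runs.

```python
import shlex
from dataclasses import dataclass, field


@dataclass
class CommandPolicy:
    """Hypothetical guardrail, not AutoDev's real configuration schema."""

    # Executable names the agent may run (assumed example values below).
    permitted: set[str] = field(default_factory=set)
    # Executable names that are always refused, even if also permitted.
    restricted: set[str] = field(default_factory=set)

    def is_allowed(self, command_line: str) -> bool:
        """Return True only if the executable is permitted and not restricted."""
        try:
            tokens = shlex.split(command_line)
        except ValueError:
            # Malformed input such as an unclosed quote is refused.
            return False
        if not tokens:
            return False
        executable = tokens[0]
        if executable in self.restricted:
            return False
        return executable in self.permitted


# Example policy with assumed command names.
policy = CommandPolicy(
    permitted={"git", "pytest", "ls"},
    restricted={"rm", "curl"},
)

print(policy.is_allowed("git status"))
print(policy.is_allowed("rm -rf build"))
```

This sketch inspects only the first token, so it is meaningful only when the command is later executed as an argument list rather than through a shell; otherwise a line such as `git status; rm -rf build` would pass the check. A real system would also need to rely on the container boundary the abstract describes.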
Similar Articles
AutoScout24 scales engineering with AI-powered workflows
AutoScout24 implemented OpenAI's Codex and ChatGPT across its engineering and broader workforce, significantly reducing development timelines and improving code quality.
@tom_doerr: Turns AI coding chats into a repeatable engineering workflow https://github.com/codeaholicguy/ai-devkit…
AI DevKit is a tool that adds a structured engineering workflow to AI coding agents, incorporating requirements, design, verification, and memory to replace one-off chats with repeatable processes.
My Homelab AI Dev Platform
The author describes setting up an AI dev platform in his homelab using OpenCode Web UI with Git access, enabling AI-assisted maintenance of Docker services via PR review and GitOps deployment.
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
OpenDevin is an open-source platform for developing AI agents that can write code, use command lines, and browse the web to interact with the environment. It supports multiple agents, sandboxed code execution, and evaluation benchmarks like SWE-Bench.
@tom_doerr: Runs a virtual company with 14 expert AI agents https://github.com/MaxMiksa/Auto-Company…
Auto-Company is an open-source project that orchestrates 14 autonomous AI agents to run a fully automated company 24/7, handling ideation, coding, deployment, and marketing without human intervention, powered by Claude Code and Codex CLI.