Tag
DeNovoSWE is a large-scale dataset for training code agents to generate entire software repositories from documentation, using a sandboxed agentic workflow and difficulty-aware filtering. Fine-tuning Qwen3-30B-A3B on it boosts performance on the BeyondSWE-Doc2Repo benchmark from 5.8% to 47.2%.