OpenThoughts-Agent: Data Recipes for Agentic Models

Hugging Face Daily Papers 06/23/26, 12:00 AM Papers

open-source data-curation agentic-models training-data fine-tuning scaling benchmarks

Summary

This paper introduces OpenThoughts-Agent, an open-source data curation pipeline for training agentic language models, achieving a 44.8% average accuracy across seven benchmarks and outperforming prior open datasets through systematic experiments.

Agentic language models dramatically expand the applications of AI, yet little is publicly known about how to curate training data for broadly capable agents. Existing open efforts such as SWE-Smith, SERA, and Nemotron-Terminal typically target a single benchmark, leaving open the question of how to train models that generalize across diverse agentic tasks. The OpenThoughts-Agent (OT-Agent) project addresses this gap with a fully open data curation pipeline for training agentic models. We conduct more than 100 controlled ablation experiments to systematically investigate each stage of the pipeline, yielding insights on the importance of task sources and diversity. We then assemble a training set of 100K examples from our pipeline and fine-tune Qwen3-32B on it, which yields an average accuracy of 44.8% across seven agentic benchmarks, a 3.9 percentage point improvement over the strongest existing open-data agentic model (Nemotron-Terminal-32B, 40.9%). Moreover, our training data exhibits strong scaling properties, outperforming alternative open datasets at every training set size in compute-controlled comparisons. We publicly release our training sets, data pipeline, experimental data, and models at openthoughts.ai to support future open research on agentic model training.
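As a quick sanity check on the headline numbers, the sketch below recomputes the gap between the two averages quoted above. Only the two accuracy figures come from the abstract; the function names are hypothetical, and the relative-gain figure is derived here for illustration and is not reported by the authors.

```python
# Figures quoted in the abstract; everything else is an illustrative assumption.
OT_AGENT_AVG = 44.8   # Qwen3-32B fine-tuned on OT-Agent data, average accuracy (%)
NEMOTRON_AVG = 40.9   # Nemotron-Terminal-32B, average accuracy (%)


def percentage_point_gap(new: float, old: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return round(new - old, 1)


def relative_gain(new: float, old: float) -> float:
    """Improvement of `new` over `old`, expressed as a percent of `old`."""
    return (new - old) / old * 100.0


gap = percentage_point_gap(OT_AGENT_AVG, NEMOTRON_AVG)
rel = relative_gain(OT_AGENT_AVG, NEMOTRON_AVG)
print(f"absolute gap: {gap} percentage points")
print(f"relative gain: {rel:.1f}%")
```

The distinction matters when comparing against other papers: the abstract reports the absolute gap in percentage points, which is smaller than the same difference expressed relative to the baseline.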

Cached at: 06/24/26, 05:47 AM

Source: https://huggingface.co/papers/2606.24855

Abstract

An open-source data curation pipeline for training agentic language models is presented, demonstrating superior performance through systematic experimentation and scalable training data.

Similar Articles

Is it agentic enough? Benchmarking open models on your own tooling

Neurodata Without Boredom: Benchmarking Agentic AI for Data Reuse

@omarsar0: Karpathy's autoresearch repo started an impressive trend. Agents can now train AI models to build SoTA agentic systems.…

OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents

Experiments in Agentic AI for Science