LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents

Hugging Face Daily Papers 05/28/26, 12:00 AM Papers

Summary

LiteCoder-Terminal-Gen introduces a zero-dependency synthetic pipeline that generates executable terminal training environments, producing SFT and RL datasets that enable language agents to achieve significant performance gains on Terminal Bench benchmarks.

Mastering terminal environments requires language agents capable of multi-step planning, feedback-grounded execution, and dynamic state adaptation. However, training such agents is currently bottlenecked by a reliance on scraped external repositories, which limits domain diversity, environment controllability, and the targeting of specific capability deficits. We introduce LiteCoder-Terminal-Gen, a zero-dependency synthesis pipeline that autonomously generates executable and verifiable terminal training environments directly from domain specifications. Using this framework, we construct two large-scale resources: LiteCoder-Terminal-SFT, comprising 11,255 expert trajectories across 10 domains, and LiteCoder-Terminal-RL, featuring 602 verifiable environments for trajectory-level preference optimization. Supervised fine-tuning of Qwen-family models on our SFT dataset yields agents that significantly outperform their base counterparts. Notably, our 32B variant achieves 29.06%, 18.54%, and 34.00% pass@1 on Terminal Bench 1.0, 2.0, and Pro, respectively. Furthermore, applying Direct Multi-turn Preference Optimization (DMPO) on our RL environments yields additional performance gains. These results systematically demonstrate that fully synthetic, executable environments offer a scalable and verifiable supervision signal for mastering complex, real-world command-line workflows.
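The abstract reports pass@1 but does not say how many attempts were run per task or how they were aggregated. As a minimal sketch, assuming the standard unbiased pass@k estimator averaged over tasks, the metric could be computed as follows. The function names and the task IDs in the usage note are hypothetical and not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task: n attempts, c of which passed."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset of attempts contains no pass.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: dict[str, list[bool]]) -> float:
    """Average per-task pass@1 over a benchmark.

    `results` maps a (hypothetical) task ID to the pass/fail outcome of
    each attempt, as judged by that task's verifier.
    """
    per_task = [pass_at_k(len(r), sum(r), 1) for r in results.values()]
    return sum(per_task) / len(per_task)
```

For k = 1 the estimator reduces to the fraction of passing attempts per task, so `benchmark_pass_at_1({"task-a": [True, False], "task-b": [False, False]})` averages 0.5 and 0.0.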

Cached at: 05/29/26, 02:59 AM

Source: https://huggingface.co/papers/2605.29559

Abstract

LiteCoder-Terminal-Gen enables scalable training of language agents for terminal environments through synthetic, executable environments; Qwen-family models fine-tuned on the resulting data outperform their base counterparts.

Mastering terminal environments requires language agents capable of multi-step planning, feedback-grounded execution, and dynamic state adaptation. However, training such agents is currently bottlenecked by a reliance on scraped external repositories, which limits domain diversity, environment controllability, and the targeting of specific capability deficits. We introduce LiteCoder-Terminal-Gen, a zero-dependency synthesis pipeline that autonomously generates executable and verifiable terminal training environments directly from domain specifications. Using this framework, we construct two large-scale resources: LiteCoder-Terminal-SFT, comprising 11,255 expert trajectories across 10 domains, and LiteCoder-Terminal-RL, featuring 602 verifiable environments for trajectory-level preference optimization. Supervised fine-tuning of Qwen-family models on our SFT dataset yields agents that significantly outperform their base counterparts. Notably, our 32B variant achieves 29.06%, 18.54%, and 34.00% pass@1 on Terminal Bench 1.0, 2.0, and Pro, respectively. Furthermore, applying Direct Multi-turn Preference Optimization (DMPO) on our RL environments yields additional performance gains. These results systematically demonstrate that fully synthetic, executable environments offer a scalable and verifiable supervision signal for mastering complex, real-world command-line workflows.

Get this paper in your agent:

hf papers read 2605.29559

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 2

#### Lite-Coder/LiteCoder-Terminal-4b-sft
4B • Updated about 1 hour ago • 125

#### Lite-Coder/LiteCoder-Terminal-30b-a3b-sft
31B • Updated about 1 hour ago • 40

Datasets citing this paper: 2

#### Lite-Coder/LiteCoder-Terminal-RL-preview
Updated about 1 hour ago • 1.23k • 3

#### Lite-Coder/LiteCoder-Terminal-SFT
Updated about 1 hour ago • 363 • 1

Similar Articles

Terminal-World: Scaling Terminal-Agent Environments via Agent Skills

What Makes Interaction Trajectories Effective for Training Terminal Agents?

Turning local agents into self-optimizing agents

A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

EndPrompt: Efficient Long-Context Extension via Terminal Anchoring
