@adithya_s_k: You can now finetune models on agent traces directly with TRL Claude Code traces Codex traces OpenClaw traces Pi traces…
Summary
TRL now supports fine-tuning models on agent traces from various sources like Claude Code, Codex, OpenClaw, and Pi, moving towards a standardized stack for training agentic models.
View Cached Full Text
Cached at: 06/05/26, 11:14 AM
You can now finetune models on agent traces directly with TRL
✅ Claude Code traces ✅ Codex traces ✅ OpenClaw traces ✅ Pi traces … many more
Feels like we’re getting closer to a standard stack for finetuning agentic models. 🤗
Quentin Lhoest 🤗 (@lhoestq): Agent traces are the new fuel.
Looking fw to announce
trlofficial support for agent traces for training💥(w/
datasetsv5, coming out tmr?)Pick your local, synthetic, or community traces and train your own specialized Agent
🔜trl sft –dataset-name julien-c/synthtraces
Similar Articles
@benhylak: we built the first sane way to debug your agent locally. you can see your traces. codex/claude code can too. this lets …
A new open source tool enables local debugging of AI agents by viewing traces, allowing automated eval writing and testing with tools like codex and Claude code.
@ClementDelangue: We need open traces so that everyone can train open agent models! cc @steipete @badlogicgames @thdxr @matanSF @hwchase17
Clement Delangue advocates for open traces to democratize training of open agent models.
@ShaokunZhang1: Want to train your own Claude Code/Codex agent with your own model? We are excited to roll out ProRL Agent V2: Polar. A…
NVIDIA releases Polar, an open-source infrastructure for black-box agentic reinforcement learning, enabling training of coding agents like Claude Code or Codex with any agent harness or framework.
@shao__meng: Why do Claude Code, Cursor, Codex, Aider, and Cline exhibit different agent behaviors despite potentially sharing the same underlying models? @addyosmani argues: It's due to the "shell" above the model — the Harness, which includes "prompts, ...
The article discusses how Addy Osmani argues that the performance difference between AI coding agents like Claude Code, Cursor, and Cline stems from their 'Harness'—the layer of prompts, tools, and constraints around the model—rather than the underlying model itself. It details best practices for harness engineering, including hooks, sandboxing, and context management, to bridge the gap between model capability and actual agent performance.
@julien_c: Today I'm launching a new project called SynthTraces It is a minimal codebase to generate synthetic coding agent sessio…
Julien Chaumond launches SynthTraces, a minimal codebase that generates synthetic coding agent session traces by having an open model (via HF Inference Providers) interact with a small local model (via llama.cpp) on real open-source codebases, producing over 2,000 Pi session traces for training and fine-tuning LLMs.