@adithya_s_k: You can now finetune models on agent traces directly with TRL Claude Code traces Codex traces OpenClaw traces Pi traces…

X AI KOLs Following 06/04/26, 03:14 PM Tools

finetuning agent-traces trl huggingface sft training

Summary

TRL now supports fine-tuning models directly on agent traces from sources such as Claude Code, Codex, OpenClaw, and Pi, a step toward a standard stack for training agentic models.

Original Article

Cached at: 06/05/26, 11:14 AM

You can now finetune models on agent traces directly with TRL

✅ Claude Code traces ✅ Codex traces ✅ OpenClaw traces ✅ Pi traces … many more

Feels like we’re getting closer to a standard stack for finetuning agentic models. 🤗

Quentin Lhoest 🤗 (@lhoestq): Agent traces are the new fuel.

Looking forward to announcing official TRL support for training on agent traces 💥

(with datasets v5, coming out tomorrow?)

Pick your local, synthetic, or community traces and train your own specialized Agent

🔜 trl sft --dataset-name julien-c/synthtraces
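
For local traces, a rough idea of the preparation step might look like the sketch below. It assumes a hypothetical trace schema (a list of events with `type` and `text` fields) and maps it to the conversational `messages` format that TRL's SFT trainer accepts for chat data. Real Claude Code, Codex, OpenClaw, and Pi logs each have their own formats, and the announced TRL support may make a manual conversion like this unnecessary. The event names and the `trace_to_messages` helper are illustrative, not part of any library.

```python
import json

# Hypothetical event types; real agent logs use their own schemas.
ROLE_BY_EVENT = {
    "user_prompt": "user",
    "agent_reply": "assistant",
    "tool_result": "tool",
}


def trace_to_messages(events):
    """Map raw trace events to a {"messages": [...]} record.

    Events whose type is not in ROLE_BY_EVENT are skipped.
    """
    messages = []
    for event in events:
        role = ROLE_BY_EVENT.get(event.get("type"))
        if role is None:
            continue
        messages.append({"role": role, "content": event.get("text", "")})
    return {"messages": messages}


# A made-up trace, for illustration only.
example_trace = [
    {"type": "user_prompt", "text": "Rename foo() to bar() in utils.py"},
    {"type": "heartbeat"},
    {"type": "agent_reply", "text": "I'll search for usages of foo first."},
    {"type": "tool_result", "text": "utils.py:12: def foo():"},
    {"type": "agent_reply", "text": "Renamed foo to bar in utils.py."},
]

record = trace_to_messages(example_trace)
jsonl_line = json.dumps(record)  # one line of a JSONL training file
```

Records like this, written one per line to a JSONL file, can be loaded with the datasets library's JSON loader before training.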
