@adithya_s_k: You can now finetune models on agent traces directly with TRL Claude Code traces Codex traces OpenClaw traces Pi traces…


Summary

TRL now supports fine-tuning models on agent traces from various sources like Claude Code, Codex, OpenClaw, and Pi, moving towards a standardized stack for training agentic models.


Cached at: 06/05/26, 11:14 AM

You can now finetune models on agent traces directly with TRL

✅ Claude Code traces
✅ Codex traces
✅ OpenClaw traces
✅ Pi traces
… many more

Feels like we’re getting closer to a standard stack for finetuning agentic models. 🤗

Quentin Lhoest 🤗 (@lhoestq): Agent traces are the new fuel.

Looking forward to announcing official trl support for agent traces for training 💥

(w/ datasets v5, coming out tmr?)

Pick your local, synthetic, or community traces and train your own specialized Agent

🔜 trl sft --dataset-name julien-c/synthtraces
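
For readers wondering what "training on a trace" involves at the data level, here is a minimal sketch of flattening a trace into the conversational `messages` layout (a list of role/content turns) that chat-style SFT datasets commonly use. Everything about the input is an assumption made for illustration: the event schema (`type`, `text`), the event names, and the role mapping are hypothetical, and real Claude Code, Codex, OpenClaw, or Pi traces each have their own formats. The post does not say how trl ingests traces, and the announced support may make a manual step like this unnecessary.

```python
import json

# Hypothetical raw trace: a list of events with "type" and "text" keys.
# This schema is invented for illustration; real agent traces differ.
raw_trace = [
    {"type": "user_prompt", "text": "List the Python files in this repo."},
    {"type": "agent_reply", "text": "I'll run a shell command to find them."},
    {"type": "tool_output", "text": "main.py\nutils.py"},
    {"type": "agent_reply", "text": "There are two: main.py and utils.py."},
]

# Assumed mapping from the hypothetical event types to chat roles.
ROLE_BY_EVENT = {
    "user_prompt": "user",
    "agent_reply": "assistant",
    "tool_output": "tool",
}


def trace_to_messages(trace):
    """Convert one hypothetical trace into a {"messages": [...]} record."""
    messages = []
    for event in trace:
        role = ROLE_BY_EVENT.get(event["type"])
        if role is None:
            continue  # skip event types this sketch does not know about
        messages.append({"role": role, "content": event["text"]})
    return {"messages": messages}


record = trace_to_messages(raw_trace)
jsonl_line = json.dumps(record)  # one line of a JSONL training file
print(jsonl_line)
```

Writing one such JSON object per line gives a JSONL file with one training conversation per trace.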
