Polar is rollout infrastructure for agent RL that lets real-world agent harnesses, such as Codex, Claude Code, OpenClaw, and Hermes, serve as training environments without code changes.
The tweet highlights a paper by the Meituan team on Skill0, an RL recipe for skill internalization, and references a related paper on self-distilled agentic RL.