Tag
TRL v1.4 is released, featuring chunked NLL loss for SFT to reduce VRAM usage and first-class integration with OpenReward for GRPO.
Hugging Face releases TRL v1.0, a major update that turns its post-training library from a research codebase into a stable, production-ready tool supporting over 75 training methods, including PPO and DPO.
Hugging Face publishes an analysis of 16 open-source reinforcement learning libraries that examines architectural patterns for asynchronous RL training. It presents design lessons for TRL's async trainer, aimed at generation bottlenecks and weight synchronization challenges.