Open weights are not enough: we need open training frameworks for research and better algorithms [P]

Reddit r/MachineLearning 06/15/26, 06:37 PM Tools

open-source reinforcement-learning training-framework llm vlm fine-tuning research

Summary

A call for open training frameworks in AI research, introducing FeynRL, a modular and explicit framework for RL post-training of LLMs, VLMs, and agents, designed to make training processes visible and modifiable.

Open weights are important, but they are not enough by themselves. If we want open ML and AI research to move forward, we also need open training frameworks: codebases that do more than run jobs. They should make the training process visible, understandable, and modifiable, so that researchers, engineers, and practitioners can build new algorithms instead of fighting hidden systems.

That was the motivation behind FeynRL (pronounced “FineRL”), a framework I built for RL post-training of LLMs, VLMs, and agents. RL is already hard to make work. With LLMs, VLMs, and agents, it becomes even messier: rollout engines, reward computation, distributed training, weight syncing, credit assignment problems, long-horizon behavior, and many small implementation details that can quietly break everything.

The core idea behind FeynRL is simple: **algorithms should stay algorithms, systems should stay systems**, and researchers, engineers, and practitioners should be able to understand the full training loop end-to-end without spending days or weeks on it.

GitHub: [https://github.com/FeynRL-project/FeynRL](https://github.com/FeynRL-project/FeynRL)

FeynRL is designed to keep the whole pipeline explicit: from data loading and rollout generation to reward computation, loss construction, optimization, and evaluation. The goal is to make it easier to develop new algorithms, training recipes, reward designs, rollout strategies, and optimization methods without digging through a convoluted hidden system.

The framework currently includes examples for SFT, DPO, and RL-style post-training for both VLMs and LLMs, with support for single-GPU, multi-GPU, and cluster setups.

Would love feedback, issues, and suggestions. I'm also curious to hear which parts of RL post-training infrastructure people still find too hidden, hard to debug, or hard to modify.
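To make the "explicit pipeline" idea concrete, here is a toy sketch of the stages listed above (data loading, rollout generation, reward computation, loss construction, optimization, evaluation) as separate plain functions. This is not FeynRL code and does not use its API: every name below is hypothetical, the "policy" is just three logits, and the algorithm is plain REINFORCE with a batch-mean baseline, chosen only because it fits in a few lines.

```python
import math
import random

# Hypothetical toy example. None of these names come from FeynRL.

def load_prompts():
    """Data loading: a fixed list of toy 'prompts'."""
    return ["p0", "p1", "p2", "p3"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate_rollouts(logits, prompts, rng):
    """Rollout generation (system side): sample one action per prompt."""
    probs = softmax(logits)
    actions = rng.choices(range(len(logits)), weights=probs, k=len(prompts))
    return list(zip(prompts, actions))

def compute_rewards(rollouts, good_action=2):
    """Reward computation: 1.0 for the 'correct' action, else 0.0."""
    return [1.0 if action == good_action else 0.0 for _, action in rollouts]

def policy_gradient(logits, rollouts, rewards):
    """Loss construction (algorithm side): REINFORCE with a batch-mean baseline.
    Returns the ascent direction for expected reward w.r.t. the logits."""
    probs = softmax(logits)
    baseline = sum(rewards) / len(rewards)
    grad = [0.0] * len(logits)
    for (_, action), reward in zip(rollouts, rewards):
        advantage = reward - baseline
        for k in range(len(logits)):
            # d log pi(action) / d logit_k = 1[k == action] - pi(k)
            grad[k] += advantage * ((1.0 if k == action else 0.0) - probs[k])
    return [g / len(rollouts) for g in grad]

def optimizer_step(logits, grad, lr=0.5):
    """Optimization: plain gradient ascent on the logits."""
    return [w + lr * g for w, g in zip(logits, grad)]

def evaluate(logits, good_action=2):
    """Evaluation: probability the policy assigns to the correct action."""
    return softmax(logits)[good_action]

def train(steps=200, seed=0):
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]  # uniform policy to start
    prompts = load_prompts()
    for _ in range(steps):
        rollouts = generate_rollouts(logits, prompts, rng)
        rewards = compute_rewards(rollouts)
        grad = policy_gradient(logits, rollouts, rewards)
        logits = optimizer_step(logits, grad)
    return logits

if __name__ == "__main__":
    print(evaluate(train()))
```

The point of the layout is that swapping the algorithm means touching only `policy_gradient`, and swapping the rollout engine means touching only `generate_rollouts`. In this toy, the printed probability should end up above the uniform starting value of 1/3.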
