@SergioPaniego: you can now train @liquidai's LFM2-VL in TRL. GRPO and RLOO included, with an example script

X AI KOLs Following 06/25/26, 04:07 PM Tools

reinforcement-learning training grpo rloo lfm2-vl liquid-ai trl

Summary

You can now train Liquid AI's LFM2-VL model using TRL's GRPO and RLOO methods, with an example script provided.
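For readers unfamiliar with the two methods, here is a minimal pure-Python sketch of how each one turns a group of rewards for the same prompt into per-sample advantages. It is not taken from the linked example script or from TRL's code. The function names, the `eps` value, and the use of the population standard deviation are assumptions for illustration, and real implementations may differ in these details.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-4):
    """GRPO-style advantages: standardize each reward within its group.

    Assumption: population standard deviation and a small eps to avoid
    division by zero; libraries may use a different estimator.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def rloo_advantages(rewards):
    """RLOO-style advantages: each sample's baseline is the mean reward
    of the *other* samples in the same group (leave-one-out)."""
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two samples per prompt")
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four completions of one prompt.
    group = [1.0, 0.0, 0.0, 1.0]
    print("GRPO:", grpo_advantages(group))
    print("RLOO:", rloo_advantages(group))
```

Both methods sample several completions per prompt and score each one relative to its siblings, so neither needs a separately trained value model.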

you can now train @liquidai's LFM2-VL in TRL. GRPO and RLOO included, with an example script https://t.co/H65pK20Q7H


