gflownets

#gflownets

$f$-Trajectory Balance: A Loss Family for Tuning GFlowNets, Generative Models, and LLMs with Off- and On-Policy Data

arXiv cs.LG · 2026-05-18

This paper introduces a family of loss functions, derived from $f$-divergences, for training generative models such as GFlowNets and LLMs. The losses are valid off-policy, and their gradients match the on-policy gradients of the corresponding $f$-divergence. Applications include molecule discovery and asynchronous LLM tuning.

#gflownets

Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models

arXiv cs.LG · 2026-05-15

This paper identifies a failure mode, "trajectory locking", in reward-maximizing post-training for diffusion language models. It proposes TraFL, a trajectory-balance objective that improves diversity and performance across math and code benchmarks.
