flow-models

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Hugging Face Daily Papers · 6d ago

QGF is an RL algorithm that improves policies at test time by using a value gradient to guide a pre-trained flow policy, avoiding training-time instability while maintaining competitive performance.

Are we really tilting? The mechanics of reward guidance in flow and diffusion models

arXiv cs.LG · 2026-06-03

This paper explains the root cause of reward hacking in reward-guided flow and diffusion models, attributing it to finite-particle plug-in estimation of the Doob h-function, and proposes a reward damping schedule to correct within-mode bias without additional computational cost.

Constrained Flow Optimization via Sequential Fine Tuning for Molecular Design

arXiv cs.LG · 2026-06-01

This paper introduces Constrained Flow Optimization (CFO), a framework for fine-tuning generative flow models to maximize rewards while satisfying constraints in molecular design, with theoretical guarantees and experimental validation.

Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

arXiv cs.AI · 2026-05-22

The paper identifies off-manifold drift in guided flow models under compositional rewards and proposes Conflict-Aware Additive Guidance (CAR), a lightweight method that dynamically resolves gradient conflicts to improve generation fidelity without retraining.

Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field

arXiv cs.LG · 2026-05-19

Flow-Direct introduces a non-parametric guidance field for flow-based generative models that accumulates reward feedback persistently, improving feedback efficiency and enabling reuse of collected samples to guide generation for multiple objectives without additional reward evaluations.
