flow-models

#flow-models

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Hugging Face Daily Papers · 6d ago

QGF is an RL algorithm that improves policies at test time by using a value gradient to guide a pre-trained flow policy, avoiding training-time instability while maintaining competitive performance.

Are we really tilting? The mechanics of reward guidance in flow and diffusion models

arXiv cs.LG · 2026-06-03

This paper explains the root cause of reward hacking in reward-guided flow and diffusion models, attributing it to finite-particle plug-in estimation of the Doob h-function, and proposes a reward damping schedule to correct within-mode bias without additional computational cost.

Constrained Flow Optimization via Sequential Fine Tuning for Molecular Design

arXiv cs.LG · 2026-06-01

Introduces Constrained Flow Optimization (CFO), a framework for fine-tuning generative flow models to maximize rewards while satisfying constraints in molecular design, with theoretical guarantees and experimental validation.

Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

arXiv cs.AI · 2026-05-22

The paper identifies off-manifold drift in guided flow models under compositional rewards and proposes Conflict-Aware Additive Guidance (CAR), a lightweight method that dynamically resolves gradient conflicts to improve generation fidelity without retraining.

Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field

arXiv cs.LG · 2026-05-19

Flow-Direct introduces a non-parametric guidance field for flow-based generative models that accumulates reward feedback persistently, improving feedback efficiency and enabling reuse of collected samples to guide generation for multiple objectives without additional reward evaluations.
