Tag
A new method for off-policy reinforcement learning with diffusion models. It handles off-policy data through flow reversal, that is, by running the diffusion process in reverse on that data.
Proposes treating flow steps as RL actions, combined with a 'flow reversal' technique, for offline reinforcement learning with flow models.