jacobian-approximation


@svlevine: Diffusion (or flow) makes for excellent policies, but training them with RL is notoriously hard: BPTT is unstable, RL o…

X AI KOLs Following ↗ · 6d ago

A new paper shows how to optimize flow-matching actors for reinforcement learning by approximating the Jacobian of the flow denoising process with the identity matrix, which makes training feasible.
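To make the idea concrete, here is a toy sketch of what replacing the denoising Jacobian with the identity could look like. It is not the paper's algorithm, which the summary does not spell out. Everything in it is an assumption made for illustration: the linear velocity field `W x + b` standing in for the actor, the Euler integrator, the vector `g` standing in for the critic gradient with respect to the action, and all function names.

```python
# Toy illustration only. The linear velocity field v(x) = W x + b, the Euler
# integrator, and the critic gradient g are all hypothetical stand-ins.

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def rollout(W, b, x0, num_steps):
    """Denoise x0 into an action with Euler steps x_{k+1} = x_k + dt * (W x_k + b)."""
    dt = 1.0 / num_steps
    x = list(x0)
    for _ in range(num_steps):
        v = matvec(W, x)
        x = [xi + dt * (vi + bi) for xi, vi, bi in zip(x, v, b)]
    return x


def grad_b_exact(W, g, num_steps):
    """dQ/db by backpropagating g = dQ/da through every denoising step (BPTT)."""
    dt = 1.0 / num_steps
    Wt = transpose(W)
    adjoint = list(g)              # dQ/dx_K at the final step
    grad = [0.0] * len(g)
    for _ in range(num_steps):     # walk the steps in reverse order
        # each step adds b directly, so it contributes dt * adjoint
        grad = [gi + dt * ai for gi, ai in zip(grad, adjoint)]
        # propagate through the step Jacobian (I + dt * W)^T
        back = matvec(Wt, adjoint)
        adjoint = [ai + dt * bi for ai, bi in zip(adjoint, back)]
    return grad


def grad_b_identity(g, num_steps):
    """Same sum, but with every step Jacobian replaced by the identity."""
    dt = 1.0 / num_steps
    grad = [0.0] * len(g)
    for _ in range(num_steps):
        # the adjoint never changes, so no backward pass through the chain
        grad = [gi + dt * ai for gi, ai in zip(grad, g)]
    return grad
```

In this toy setting the identity version never multiplies by the step Jacobians, so the critic gradient reaches every denoising step unchanged. The two gradients coincide when `W` is zero and drift apart as `W` grows, which is the bias the approximation accepts in exchange for skipping the backward pass through the chain.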


