jacobian-approximation


@svlevine: Diffusion (or flow) makes for excellent policies, but training them with RL is notoriously hard: BPTT is unstable, RL o…

X AI KOLs Following · 6d ago

New paper shows how to optimize flow matching actors for reinforcement learning by approximating the Jacobian of the flow denoising process with the identity matrix, making training feasible.
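To make the identity-Jacobian idea concrete, here is a minimal toy sketch. It is not the paper's algorithm. It assumes a scalar action, a hypothetical linear velocity field `v(a) = w * a + b` integrated with Euler steps, and a hypothetical quadratic critic `Q(a) = -(a - target)^2`. All names and values are illustrative. It compares the critic gradient with respect to the initial noise sample when the Jacobian of the denoising chain is computed exactly versus when it is replaced by the identity.

```python
def denoise(a0, w, b, steps):
    """Euler-integrate the toy velocity field v(a) = w * a + b over [0, 1]."""
    h = 1.0 / steps
    a = a0
    for _ in range(steps):
        a = a + h * (w * a + b)
    return a


def exact_jacobian(w, steps):
    """d a_final / d a0 for the scalar chain: product of per-step factors (1 + h * w)."""
    h = 1.0 / steps
    return (1.0 + h * w) ** steps


def critic_grad(a, target):
    """Gradient of the toy critic Q(a) = -(a - target)^2 with respect to a."""
    return -2.0 * (a - target)


# Hypothetical toy values, chosen only for illustration.
a0, w, b, steps, target = 0.5, 0.1, 0.2, 10, 1.0

a_final = denoise(a0, w, b, steps)
g = critic_grad(a_final, target)

# Backpropagating through the whole chain multiplies by the exact Jacobian.
grad_bptt = g * exact_jacobian(w, steps)

# Identity approximation: treat d a_final / d a0 as 1 and pass the critic gradient through.
grad_identity = g * 1.0

print(grad_bptt, grad_identity)
```

In this toy setting the two gradients share a sign and differ only by the scale factor `(1 + h * w) ** steps`, which is the quantity the identity approximation discards. Whether that holds for a learned, nonlinear flow is an empirical question the sketch does not address.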
