KVPO is an ODE-native online GRPO framework that aligns streaming autoregressive video generators with human preferences. It uses causal-semantic KV cache exploration and a velocity-field surrogate policy, and achieves consistent improvements in visual quality and alignment.