q-steering


QPILOTS: Efficient Test-Time Q-Steering for Flow Policies

arXiv cs.LG · 5d ago

QPILOTS is a method that steers flow policies at inference time by using critic gradients projected from noisy intermediate states, achieving state-of-the-art performance on offline-to-online RL benchmarks and improving pretrained VLA models without modifying the base policy.
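The summary's core idea, nudging a frozen flow policy's sampling with critic gradients taken at an estimate projected from the noisy intermediate state, can be illustrated with a toy sketch. This is not the paper's algorithm: the velocity field, the quadratic critic, the projection rule `a_hat = x + (1 - t) * v`, and every name below (`base_velocity`, `critic_grad`, `guidance_scale`) are assumptions made up for illustration.

```python
import random

MU = 0.0      # hypothetical mode of the frozen base policy
TARGET = 1.0  # hypothetical action that the toy critic prefers


def base_velocity(x, t):
    # Toy stand-in for a pretrained flow policy's velocity network (assumption).
    # It is never modified; steering only changes how we sample from it.
    return MU - x


def q_value(a):
    # Toy critic: Q(a) = -(a - TARGET)^2, highest at a == TARGET.
    return -(a - TARGET) ** 2


def critic_grad(a):
    # Analytic gradient dQ/da of the toy critic above.
    return -2.0 * (a - TARGET)


def sample_action(x0, guidance_scale=0.0, n_steps=10):
    # Euler integration of the flow from noise (t = 0) toward an action (t = 1).
    x = x0
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = base_velocity(x, t)
        # Project the noisy intermediate state to a one-step estimate of the
        # final action (assumed straight-path projection).
        a_hat = x + (1.0 - t) * v
        # Add the critic gradient evaluated at the projected action.
        v = v + guidance_scale * critic_grad(a_hat)
        x = x + dt * v
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(256)]

plain = [sample_action(x0, guidance_scale=0.0) for x0 in noise]
steered = [sample_action(x0, guidance_scale=1.0) for x0 in noise]

mean_q_plain = sum(q_value(a) for a in plain) / len(plain)
mean_q_steered = sum(q_value(a) for a in steered) / len(steered)
print(mean_q_plain, mean_q_steered)
```

In this toy setup the steered samples score higher under the critic on average, while `base_velocity` itself is left untouched. Setting `guidance_scale=0.0` recovers the unsteered sampler.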
