flow-based-models

Moment Matching Q-Learning

arXiv cs.LG · 2026-05-29

Moment Matching Q-Learning (MoMa QL) uses maximum mean discrepancy (MMD) to match all moment statistics, targeting distribution-level convergence in offline reinforcement learning (RL). It is reported to be computationally efficient and to perform strongly on D4RL tasks.
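
To make the idea of matching distributions through MMD concrete, here is a minimal sketch of a sample-based squared-MMD estimate with a Gaussian kernel. It is a generic textbook estimator, not the MoMa QL algorithm itself: the function names `rbf_kernel` and `mmd2_biased`, the `bandwidth` parameter, and the toy Gaussian data are all assumptions made for illustration. With a characteristic kernel such as the Gaussian one, the population MMD is zero only when the two distributions coincide, which is the sense in which it compares all moments at once.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of a (n, d) and b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    # Toy data (hypothetical): two samples from the same Gaussian and one shifted.
    rng = np.random.default_rng(0)
    same_a = rng.normal(0.0, 1.0, size=(200, 2))
    same_b = rng.normal(0.0, 1.0, size=(200, 2))
    shifted = rng.normal(3.0, 1.0, size=(200, 2))
    print("same distribution:   ", mmd2_biased(same_a, same_b))
    print("shifted distribution:", mmd2_biased(same_a, shifted))
```

The estimate for the shifted sample should come out clearly larger than the one for two samples drawn from the same distribution. The biased estimator is used here because it is the shortest to write; an unbiased variant would exclude the diagonal kernel terms.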
