Tag
This research paper explores methods for recovering hidden rewards from diffusion-based policies, likely with the aim of improving the alignment or efficiency of such models.