Tag
DiPOD stabilizes diffusion policy optimization by interleaving self-distillation with policy-gradient updates to maintain a tight ELBO, preventing the double-drift phenomenon and achieving higher rewards in both language and continuous-control tasks.
Introduces StereoPolicy, a framework that leverages synchronized stereo image pairs to improve geometric reasoning for robot manipulation policies, avoiding the fragility of RGB-D and point-cloud inputs. It integrates with diffusion-based and vision-language-action policies, showing consistent improvements in simulation and real-world tasks.
This paper introduces Parameterized Diffusion Policy (PDP), a framework that makes diffusion policies controllable by conditioning on low-dimensional latent parameters, enabling smooth behavior interpolation and adaptation without retraining. It demonstrates improved performance on complex multimodal robot tasks in simulation and real-world experiments.
Proposes Model-Based Diffusion Policy Optimization (MBDPO), a framework that unifies search and policy optimization in world models using diffusion policy representations, achieving consistent scaling behavior and superior performance across offline and online reinforcement learning tasks.
This paper introduces the Frequency Guidance Operator (FGO), a method for diffusion policies that smooths action generation by steering noisy samples through intermediate sub-frequency manifolds, improving performance on robotic manipulation tasks.