stabilization

Diffusion Policy Optimization without Drifting Apart

arXiv cs.LG · 17h ago

DiPOD stabilizes diffusion policy optimization by interleaving self-distillation with policy-gradient updates to maintain a tight ELBO, preventing the double-drift phenomenon and achieving higher rewards in both language and continuous control tasks.

Predictive Assistance and the Temporal Dynamics of Exploratory Compression

arXiv cs.AI · 5d ago

This paper develops a geometric dynamical framework to model how predictive AI assistance alters exploratory cognition by stabilizing trajectories before self-generated exploration, leading to reduced exploratory responsiveness, hysteresis, and delayed recovery upon assistance withdrawal.
