Introduces StereoPolicy, a framework that leverages synchronized stereo image pairs to improve geometric reasoning for robot manipulation policies, avoiding the fragility of RGB-D and point clouds. It integrates with diffusion-based and vision-language-action policies, showing consistent improvements in simulation and real-world tasks.
Stanford University has released free online resources that teach high-income AI skills in 90 minutes, which may give early viewers a significant advantage.
This paper introduces BALAR, a training-free Bayesian agentic loop algorithm that enables large language models to actively reason and ask clarifying questions in multi-turn interactions. It demonstrates significant performance improvements over baselines on detective, puzzle, and clinical diagnosis benchmarks.