@_akhaliq: CHORUS: Decentralized Multi-Embodiment Collaboration with One VLA Policy
Summary
CHORUS is a decentralized method enabling multiple robots with different embodiments to collaborate using a single Vision-Language-Action policy.
Cached at: 06/12/26, 03:01 PM
CHORUS
Decentralized Multi-Embodiment Collaboration with One VLA Policy https://t.co/QIyWXsMdiL
Similar Articles
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
Qwen-VLA is a unified vision-language-action model for embodied decision-making, integrating manipulation, navigation, and trajectory prediction across different robot platforms. It uses a DiT-based action decoder and embodiment-aware prompt conditioning, achieving strong performance and out-of-distribution generalization.
AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding
AffordanceVLA introduces a unified framework using structured affordance forecasting as an intermediate representation to improve perception-action mapping in robotic manipulation, leveraging vision-language models and a Mixture-of-Transformer architecture.
D-VLA: A High-Concurrency Distributed Asynchronous Reinforcement Learning Framework for Vision-Language-Action Models
D-VLA proposes a high-concurrency distributed asynchronous reinforcement learning framework for Vision-Language-Action models, using plane decoupling and a swimlane pipeline to improve throughput and efficiency in large-scale embodied AI training.
Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack
HyVLA-0.5 is an end-to-end robotic learning system that integrates data collection, model design, pre-training, fine-tuning, and reinforcement learning for real-world deployment.
HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System
HiVLA introduces a hierarchical vision-language-action framework that decouples semantic planning from motor control, using a diffusion transformer (DiT) action expert to improve robotic manipulation. The system combines a VLM planner for task decomposition and visual grounding with a specialized DiT action expert that uses cascaded cross-attention, and it outperforms end-to-end baselines, particularly on long-horizon tasks and fine-grained manipulation.