MuJoCo-Drones-Gym: A GPU-Accelerated Multi-Drone Simulator for Control and Reinforcement Learning
Summary
This paper presents MuJoCo-Drones-Gym, a GPU-accelerated multi-drone simulator built on MuJoCo that supports flexible physics models, action interfaces, and observation spaces for reinforcement learning and control research.
Cached at: 06/12/26, 06:51 AM
Source: https://huggingface.co/papers/2606.08039
Abstract
A Gymnasium-compatible multi-drone simulation environment built on the MuJoCo physics engine, supporting flexible physics models, action interfaces, and observation spaces for reinforcement learning applications.
Robotic simulators are a cornerstone of modern research in aerial robotics, serving both as a vehicle for the development of new control algorithms and as the data source for training reinforcement learning (RL) policies. Yet, existing quadcopter learning environments often face a trade-off between physical fidelity, multi-agent support, and the throughput required by modern deep RL pipelines. In this paper, we present MuJoCo-Drones-Gym, an open-source Gymnasium-compatible multi-drone environment built on top of the MuJoCo physics engine. MuJoCo-Drones-Gym supports an arbitrary number of Bitcraze Crazyflie 2.x nano-quadcopters and exposes a modular API for selecting (i) the physics model (rigid-body MuJoCo, explicit Python dynamics, or any subset of ground effect, blade drag, and inter-drone downwash), (ii) the action interface (per-motor RPMs, collective normalized thrust, velocity setpoints, or PID waypoint commands), and (iii) the observation space (kinematic state vectors, RGB / depth / segmentation cameras, or neighbourhood adjacency information). A PettingZoo ParallelEnv wrapper enables drop-in multi-agent reinforcement learning, while a suite of seven task environments (hover, velocity tracking, multi-drone hover, waypoint navigation, formation flight, gate racing, and a generic multi-agent template) demonstrates the breadth of the interface. We describe the environment design, the underlying physics and quadcopter dynamics, and illustrate its use through control and learning examples that mirror those of the closely related gym-pybullet-drones project, while taking advantage of MuJoCo’s improved contact handling, rendering, and parallelizability.
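To make the per-motor RPM action interface more concrete, the following is a minimal sketch of the quadratic rotor model commonly used for small quadcopters, in which each motor's thrust and drag torque scale with the square of its RPM. It is not taken from MuJoCo-Drones-Gym: the function names, the motor spin directions, and all constants (mass, thrust and torque coefficients) are illustrative assumptions roughly in the range of a Crazyflie-class vehicle.

```python
import math

# Illustrative constants (assumptions, not values from the paper).
MASS_KG = 0.027   # assumed vehicle mass
GRAVITY = 9.81    # gravitational acceleration, m/s^2
KF = 3.16e-10     # assumed thrust coefficient, N per RPM^2
KM = 7.94e-12     # assumed drag-torque coefficient, N*m per RPM^2


def motor_thrusts(rpms):
    """Per-motor thrust in newtons under the quadratic model F_i = KF * rpm_i^2."""
    return [KF * rpm ** 2 for rpm in rpms]


def collective_thrust(rpms):
    """Total thrust along the body z-axis, in newtons."""
    return sum(motor_thrusts(rpms))


def yaw_torque(rpms, spin_dirs=(-1, 1, -1, 1)):
    """Net yaw torque; spin_dirs is a hypothetical alternating spin layout."""
    return sum(d * KM * rpm ** 2 for d, rpm in zip(spin_dirs, rpms))


def hover_rpm():
    """RPM at which four identical motors exactly balance the vehicle's weight."""
    return math.sqrt(MASS_KG * GRAVITY / (4 * KF))


if __name__ == "__main__":
    rpm = hover_rpm()
    print(f"hover RPM per motor: {rpm:.0f}")
    print(f"collective thrust at hover: {collective_thrust([rpm] * 4):.4f} N")
```

Under this model, a collective normalized thrust command can be read as a scaling of the total force, whereas per-motor RPM commands also determine the body torques through differences between rotors.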
Similar Articles
MuJoCo – Advanced Physics Simulation
Google DeepMind maintains MuJoCo, a high-performance open-source physics engine with C/Python APIs and Unity plugin for robotics and ML research.
Faster physics in Python
OpenAI open-sources mujoco-py, a high-performance Python library for robotic simulation using the MuJoCo engine, featuring ~40x speedup with headless GPU rendering and VR interaction support.
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo
DexJoCo introduces a benchmark and toolkit for task-oriented dexterous manipulation in MuJoCo, featuring 11 functional tasks, a low-cost data collection system, and comprehensive evaluations that highlight limitations in current dexterous manipulation policies.
@guanqi_he: We release Wuji MJLab, an open-source MuJoCo environment for dexterous hand manipulation. It includes a cube reorientat…
Wuji MJLab is an open-source MuJoCo environment for dexterous hand manipulation, featuring a cube reorientation task, sim2real pipeline, and deployment on the Wuji Hand. It is based on mjlab and includes pretrained PPO policies.
DiffAero: A GPU-Accelerated Differentiable Simulation Framework for Efficient Quadrotor Policy Learning
DiffAero is a GPU-accelerated, fully differentiable simulation framework for quadrotor control policy learning that supports environment- and agent-level parallelism, multiple dynamics models, and customizable sensors. It enables robust flight policy learning in hours on consumer-grade hardware and is released as open-source.