Safety Gym

OpenAI Blog Tools

Summary

OpenAI introduces Safety Gym, a new suite of benchmark environments and tools for studying constrained reinforcement learning and safe exploration. The suite features multiple robots and tasks, and it measures safe exploration through cost functions that are tracked separately from reward functions.

We’re releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.


# Safety Gym

Source: [https://openai.com/index/safety-gym/](https://openai.com/index/safety-gym/)

The first step towards making progress on a problem like safe exploration is to quantify it: figure out what can be measured, and how going up or down on those metrics gets us closer to the desired outcome. Another way to say it is that we need to pick a formalism for the safe exploration problem. A formalism allows us to design algorithms that achieve our goals. While there are several options, there is not yet a universal consensus in the field of safe exploration research about the right formalism. We spent some time thinking about it, and the formalism we think makes the most sense to adopt is constrained reinforcement learning.

[Constrained RL](https://www-sop.inria.fr/members/Eitan.Altman/TEMP/h.pdf) is like normal RL, but in addition to a reward function that the agent wants to maximize, environments have cost functions that the agent needs to constrain. For example, consider an agent controlling a self-driving car. We would want to reward this agent for getting from point A to point B as fast as possible. But naturally, we would also want to constrain the driving behavior to match traffic safety standards.

We think constrained RL may turn out to be more useful than normal RL for ensuring that agents satisfy safety requirements. A big problem with normal RL is that everything about the agent’s eventual behavior is described by the reward function, but reward design is fundamentally hard. A key part of the challenge comes from picking trade-offs between competing objectives, such as task performance and satisfying safety requirements. In constrained RL, we don’t have to pick trade-offs; instead, we pick outcomes, and let algorithms figure out the trade-offs that get us the outcomes we want.

We can use the self-driving car case to sketch what this means in practice.
Suppose the car earns some amount of money for every trip it completes, and has to pay a fine for every collision. In normal RL, you would pick the collision fine at the beginning of training and keep it fixed forever. The problem here is that if the pay-per-trip is high enough, the agent may not care whether it gets in lots of collisions (as long as it can still complete its trips). In fact, it may even be advantageous to drive recklessly and risk those collisions in order to get the pay. We have seen this before when training [unconstrained RL agents](https://openai.com/index/faulty-reward-functions/). By contrast, in constrained RL you would pick the acceptable collision rate at the beginning of training, and adjust the collision fine until the agent is meeting that requirement. If the car is getting in too many fender-benders, you raise the fine until that behavior is no longer incentivized.

To study constrained RL for safe exploration, we developed a new set of environments and tools called Safety Gym. By comparison to existing environments for constrained RL, Safety Gym environments are richer and feature a wider range of difficulty and complexity.

In all Safety Gym environments, a robot has to navigate through a cluttered environment to achieve a task. There are three pre-made robots (Point, Car, and Doggo), three main tasks (Goal, Button, and Push), and two levels of difficulty for each task. We give an overview of the robot-task combinations below, but make sure to check out [the paper](https://cdn.openai.com/safexp-short.pdf) for details.

In these videos, we show how an agent without constraints tries to solve these environments. Every time the robot does something unsafe (which here means running into clutter), a red warning light flashes around the agent, and the agent incurs a cost (separate from the task reward).
Because these agents are unconstrained, they often wind up behaving unsafely while trying to maximize reward.

**Point** is a simple robot constrained to the 2D plane, with one actuator for turning and another for moving forward or backward. Point has a small front-facing square which helps with the Push task.
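
To make the self-driving car example from earlier more concrete, here is a minimal sketch of the "pick the acceptable collision rate, then adjust the fine" idea. Everything in it is an assumption made for illustration and is not part of Safety Gym: the toy car chooses a single speed in [0, 1], its pay grows linearly with speed, its collision rate grows with the square of speed, and the names `best_response_speed`, `collision_rate`, and `tune_fine` are hypothetical. The update rule is a simple dual-ascent style adjustment, one of several ways a constrained RL algorithm could tune the fine.

```python
def best_response_speed(pay_per_trip: float, fine: float) -> float:
    """Speed in [0, 1] that maximizes pay_per_trip * s - fine * s**2.

    Toy stand-in for "train the agent under the current fine".
    """
    if fine <= 0.0:
        return 1.0  # with no fine, driving as fast as possible pays best
    return min(1.0, pay_per_trip / (2.0 * fine))


def collision_rate(speed: float) -> float:
    """Assumed toy model: collisions grow quadratically with speed."""
    return speed ** 2


def tune_fine(pay_per_trip: float, target_rate: float,
              step_size: float = 100.0, iterations: int = 300,
              fine: float = 1.0) -> tuple:
    """Raise or lower the fine until the collision rate meets the target."""
    for _ in range(iterations):
        speed = best_response_speed(pay_per_trip, fine)
        # Positive when the car collides more often than allowed.
        violation = collision_rate(speed) - target_rate
        # Too many collisions: raise the fine. Too few: relax it.
        fine = max(0.0, fine + step_size * violation)
    return fine, best_response_speed(pay_per_trip, fine)


if __name__ == "__main__":
    pay = 10.0

    # "Normal RL": the fine is fixed up front and never changes.
    fixed_speed = best_response_speed(pay, fine=1.0)
    print("fixed fine -> collision rate:", collision_rate(fixed_speed))

    # "Constrained RL": fix the acceptable outcome, let the fine adapt.
    tuned_fine, tuned_speed = tune_fine(pay, target_rate=0.04)
    print("tuned fine:", round(tuned_fine, 2))
    print("tuned collision rate:", round(collision_rate(tuned_speed), 4))
```

With the fixed fine, the high pay-per-trip makes full speed the best choice in this toy model, so the collision rate stays at its maximum. With the adaptive fine, the designer only states the acceptable collision rate, and the fine settles at whatever value makes that rate the car's best choice.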
