KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning
Summary
KinDER is a new open-source benchmark for physical reasoning in robotics, featuring procedurally generated environments and baselines to evaluate kinematic and dynamic constraint challenges.
Cached at: 05/08/26, 07:47 AM
Source: https://huggingface.co/papers/2604.25788
Abstract
KinDER is a benchmark for physical reasoning in robotics that includes procedurally generated environments and baselines spanning multiple learning paradigms to address kinematic and dynamic constraint challenges.
Robotic systems that interact with the physical world must reason about kinematic and dynamic constraints imposed by their own embodiment, their environment, and the task at hand. We introduce KinDER, a benchmark for Kinematic and Dynamic Embodied Reasoning that targets physical reasoning challenges arising in robot learning and planning. KinDER comprises 25 procedurally generated environments, a Gymnasium-compatible Python library with parameterized skills and demonstrations, and a standardized evaluation suite with 13 implemented baselines spanning task and motion planning, imitation learning, reinforcement learning, and foundation-model-based approaches. The environments are designed to isolate five core physical reasoning challenges: basic spatial relations, nonprehensile multi-object manipulation, tool use, combinatorial geometric constraints, and dynamic constraints, disentangled from perception, language understanding, and application-specific complexity. Empirical evaluation shows that existing methods struggle to solve many of the environments, indicating substantial gaps in current approaches to physical reasoning. We additionally include real-to-sim-to-real experiments on a mobile manipulator to assess the correspondence between simulation and real-world physical interaction. KinDER is fully open-sourced and intended to enable systematic comparison across diverse paradigms for advancing physical reasoning in robotics. Website and code: https://prpl-group.com/kinder-site/
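The abstract notes that KinDER environments are exposed through the standard Gymnasium API. As a minimal illustration only, the sketch below shows that interaction loop with a stand-in environment class (the `StubEnv` name and its dynamics are hypothetical; real task IDs and observation spaces come from the KinDER library itself):

```python
import random


class StubEnv:
    """Hypothetical stand-in for a KinDER task environment.

    Mirrors the Gymnasium interface: reset() returns (obs, info),
    step(action) returns (obs, reward, terminated, truncated, info).
    """

    def __init__(self, horizon=50):
        self.horizon = horizon  # episode length before truncation
        self.t = 0

    def reset(self, seed=None):
        random.seed(seed)
        self.t = 0
        return [0.0, 0.0], {}  # initial observation, empty info dict

    def step(self, action):
        self.t += 1
        obs = [random.uniform(-1, 1), random.uniform(-1, 1)]
        reward = -abs(action)            # toy cost on action magnitude
        terminated = False               # no success condition in this stub
        truncated = self.t >= self.horizon
        return obs, reward, terminated, truncated, {}


# Standard Gymnasium-style rollout with a random policy.
env = StubEnv()
obs, info = env.reset(seed=0)
total = 0.0
done = False
while not done:
    action = random.uniform(-1, 1)  # random-policy baseline action
    obs, reward, terminated, truncated, info = env.step(action)
    total += reward
    done = terminated or truncated
print(f"episode return: {total:.2f}")
```

A real baseline would swap the random policy for a learned or planned one; the surrounding loop stays the same, which is what makes the benchmark's evaluation suite comparable across paradigms.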
Get this paper in your agent:
hf papers read 2604.25788
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Models citing this paper: 4
#### kinder-bench/kinder-openpi-checkpoints Robotics • Updated about 12 hours ago
#### kinder-bench/kinder-DP-checkpoints Robotics • Updated about 12 hours ago
#### kinder-bench/kinder-DPES-checkpoints Robotics • Updated about 12 hours ago
#### kinder-bench/kinder-mbrl-checkpoints Robotics • Updated about 12 hours ago
Datasets citing this paper: 1
#### kinder-bench/kinder-datasets Updated about 12 hours ago • 39
Spaces citing this paper: 0
Collections including this paper: 0
Similar Articles
RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies
RoboLab is a high-fidelity simulation benchmarking framework for evaluating task-generalist robotic policies, introducing the RoboLab-120 benchmark with 120 tasks across visual, procedural, and relational competency axes. It enables scalable, realistic task generation and systematic analysis of policy behavior under controlled perturbations to assess true generalization capabilities.
Safety Gym
OpenAI introduces Safety Gym, a new benchmark environment and toolkit for studying constrained reinforcement learning and safe exploration. The platform features multiple robots and tasks designed to quantify and measure safe exploration through cost functions alongside reward functions.
ShapeCodeBench: A Renewable Benchmark for Perception-to-Program Reconstruction of Synthetic Shape Scenes
ShapeCodeBench is a synthetic benchmark for perception-to-program reconstruction where models generate executable drawing programs from raster images, evaluated on metrics like exact match and pixel accuracy. The benchmark is designed to be renewable via seeded RNG, and current models still achieve low exact match rates, indicating room for improvement.
Benchmarking safe exploration in deep reinforcement learning
OpenAI proposes standardizing constrained RL as the formalism for safe exploration and introduces Safety Gym, a benchmark suite for evaluating safe deep RL algorithms in high-dimensional continuous control tasks with safety constraints.
RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark
RoboMemArena introduces a large-scale benchmark for evaluating robotic memory across 26 complex tasks with real-world validation, alongside PrediMem, a dual-system vision-language-action model that improves memory management through predictive coding.