Differentiable Learning of Lifted Action Schemas for Classical Planning

arXiv cs.AI Papers

Summary

This paper introduces a neural network architecture that learns lifted action schemas from traces in which states are fully observed but action arguments are not. The resulting differentiable component is intended for integration into larger neuro-symbolic models.


Cached at: 05/14/26, 06:15 AM

# Differentiable Learning of Lifted Action Schemas for Classical Planning
Source: [https://arxiv.org/abs/2605.13282](https://arxiv.org/abs/2605.13282)
[View PDF](https://arxiv.org/pdf/2605.13282)

> Abstract: Classical planners can effectively solve very large deterministic MDPs represented in STRIPS or PDDL where states are sets of atoms over objects and relations, and lifted action schemas add or delete these atoms. This compact representation yields strong search heuristics and provides an ideal setting for structural generalization, since lifted relations and action schemas give rise to infinitely many domain instances. A central challenge is to learn these relations and action schemas from data, and recent approaches have addressed this problem using different types of observations. In this work, we develop a novel neural network architecture for learning action schemas from traces where states are fully observed but action arguments are unobserved. The problem is a simplification but an important step towards learning planning domains from sequences of images and action labels, and we aim to solve this simplification in a nearly perfect manner. The challenge lies in learning the action schemas while simultaneously identifying the action arguments from observed state changes. Our approach yields a robust differentiable component that can then be integrated into larger neuro-symbolic models. We evaluate the architecture on various planning domains, where the learned lifted action schemas must recover the ground-truth structure. Additionally, we report experiments on robustness to observation noise and on a variation related to slot-based dynamics models.
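
To make the setting in the abstract concrete, the following is a minimal symbolic sketch, not the paper's neural architecture. It shows a state as a set of atoms, a lifted schema that adds and deletes atoms once its parameters are bound to objects, and a brute-force search for the unobserved arguments that explain an observed state change. The Blocksworld `stack` schema, the helper names (`Schema`, `ground`, `apply_action`, `infer_args`), and the example state are all illustrative assumptions; the paper learns the schema itself and identifies the arguments in a differentiable way, which this enumeration only stands in for.

```python
from dataclasses import dataclass
from itertools import product

# An atom is a relation name followed by its arguments, e.g. ("on", "a", "b").
Atom = tuple[str, ...]


@dataclass(frozen=True)
class Schema:
    """Hypothetical STRIPS-style lifted action schema (illustration only)."""
    name: str
    params: tuple[str, ...]
    pre: frozenset[Atom]     # lifted preconditions over params
    add: frozenset[Atom]     # lifted add effects
    delete: frozenset[Atom]  # lifted delete effects


def ground(atoms, binding):
    """Replace schema parameters with objects according to the binding."""
    return frozenset((a[0],) + tuple(binding[v] for v in a[1:]) for a in atoms)


def apply_action(schema, args, state):
    """Return the successor state, or None if the preconditions fail."""
    binding = dict(zip(schema.params, args))
    if not ground(schema.pre, binding) <= state:
        return None
    return (state - ground(schema.delete, binding)) | ground(schema.add, binding)


def infer_args(schema, objects, state, next_state):
    """Enumerate all argument tuples that explain the observed transition."""
    return [
        args
        for args in product(objects, repeat=len(schema.params))
        if apply_action(schema, args, state) == next_state
    ]


# Assumed ground-truth schema for Blocksworld: stack(x, y).
STACK = Schema(
    name="stack",
    params=("x", "y"),
    pre=frozenset({("holding", "x"), ("clear", "y")}),
    add=frozenset({("on", "x", "y"), ("clear", "x"), ("handempty",)}),
    delete=frozenset({("holding", "x"), ("clear", "y")}),
)

# Fully observed states before and after an action whose arguments are hidden.
s = frozenset({("holding", "a"), ("clear", "b"), ("on", "b", "c"), ("ontable", "c")})
s_next = frozenset({
    ("on", "a", "b"), ("clear", "a"), ("handempty",),
    ("on", "b", "c"), ("ontable", "c"),
})

print(infer_args(STACK, ["a", "b", "c"], s, s_next))  # [('a', 'b')]
```

Here the schema is given and only the arguments are recovered. In the problem the abstract describes, the add and delete lists are unknown as well, so both must be inferred jointly from many such transitions.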

## Submission history

From: Jonas Reiter [[view email](https://arxiv.org/show-email/ce83a3d6/2605.13282)] **[v1]** Wed, 13 May 2026 09:59:49 UTC (374 KB)
