SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks
Summary
SPIN is a planning wrapper that ensures structurally valid DAG plans and uses prefix-based execution control to reduce task steps and tool calls in industrial LLM agent systems, improving plan validity and efficiency.
Cached at: 05/15/26, 04:24 AM
Source: https://huggingface.co/papers/2605.14051
Abstract
SPIN is a planning wrapper that combines validated DAG planning with prefix-based execution control to reduce the number of executed tasks and improve plan validity in industrial LLM agent systems.
Industrial LLM agent systems often separate planning from execution, yet LLM planners frequently produce structurally invalid or unnecessarily long workflows, leading to brittle failures and avoidable tool and API cost. We propose SPIN, a planning wrapper that combines validated Directed Acyclic Graph (DAG) planning with prefix-based execution control. SPIN enforces a strict DAG contract through _validate_plan_text and repair prompting, producing executable plans before downstream execution, and then evaluates DAG prefixes incrementally to stop when the current prefix is sufficient to answer the query. On AssetOpsBench, across 261 scenarios, SPIN reduces executed tasks from 1061 to 623 and improves Accomplished from 0.638 to 0.706, while reducing tool calls from 11.81 to 6.82 per run. On MCP Bench, the same wrapper improves planning, grounding, and dependency-related scores for both GPT OSS1 and Llama 4 Maverick.
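The abstract does not show how the DAG contract or the prefix check is implemented, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a plan is represented as a mapping from task id to the ids it depends on; the names validate_plan, plan_with_repair, run_with_prefix_stop, and the callables generate, execute, and is_sufficient are hypothetical and are not taken from the paper.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Callable

# Assumed plan format (not from the paper): task id -> ids it depends on.
Plan = dict[str, list[str]]


def validate_plan(plan: Plan) -> list[str]:
    """Return structural errors; an empty list means the plan is a valid DAG."""
    errors = [
        f"{task}: unknown dependency {dep}"
        for task, deps in plan.items()
        for dep in deps
        if dep not in plan
    ]
    if errors:
        return errors
    try:
        TopologicalSorter(plan).prepare()  # raises CycleError on any cycle
    except CycleError as exc:
        errors.append(f"cycle detected: {exc.args[1]}")
    return errors


def plan_with_repair(generate: Callable[[list[str]], Plan], max_attempts: int = 3) -> Plan:
    """Re-prompt a (hypothetical) planner with the validation errors until the plan is valid."""
    errors: list[str] = []
    for _ in range(max_attempts):
        plan = generate(errors)  # errors from the previous attempt act as the repair hint
        errors = validate_plan(plan)
        if not errors:
            return plan
    raise ValueError(f"no valid plan after {max_attempts} attempts: {errors}")


def run_with_prefix_stop(
    plan: Plan,
    execute: Callable[[str, dict], object],
    is_sufficient: Callable[[dict], bool],
) -> dict:
    """Execute tasks in dependency order and stop once the executed prefix suffices."""
    results: dict = {}
    for task in TopologicalSorter(plan).static_order():
        results[task] = execute(task, results)
        if is_sufficient(results):  # e.g. a judge deciding the query can be answered
            break
    return results
```

In this sketch, a three-step chain such as fetch, summarize, report would stop after summarize if is_sufficient accepts the results at that point, which is the kind of saving in executed tasks and tool calls that the abstract reports.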