Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems

Hugging Face Daily Papers 05/23/26, 12:00 AM Papers

physical-ai runtime-guardrails autonomous-systems safety silent-failures literature-review

Summary

This literature review identifies and analyzes the problem of silent failures in physical AI systems, where black-box models may execute harmful actions without detection. It proposes a taxonomy of runtime guardrail functions and outlines evaluation requirements for safe autonomous systems.

Physical AI systems increasingly map multimodal observations, language instructions, and learned world representations into physically consequential actions. Robotics foundation models, vision-language-action models, and world-model-based autonomous systems can condition decisions that move vehicles, robots, drones, and industrial machines. This transition exposes a safety problem that is not fully captured by conventional AI content moderation or by classical robot safety alone: a black-box model may issue a physically consequential action while appearing confident, plausible, and semantically aligned. The resulting failure can be silent, arising from sensor drift, occlusion, state-estimation error, distribution shift, hallucinated affordances, or invalid physical assumptions before downstream hardware controllers detect a violation. Across embodied foundation models, world models, robotics simulation, embodied safety benchmarks, safe control, runtime assurance, uncertainty estimation, verification, and guardrail evaluation, model capability and safety mechanisms have advanced along largely separate technical tracks. A recurring gap synthesized here is that no single stream surveyed in this review supplies a complete runtime authorization boundary between black-box Physical AI models and physical execution. The resulting analysis develops a bounded problem formulation, a definition of silent physical-action failure, a taxonomy of runtime guardrail functions, and evaluation requirements for comparing guardrails as Physical AI assurance mechanisms.
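To make the idea of a runtime authorization boundary concrete, the following is a minimal sketch of a gate placed between a black-box policy and the actuators. It is an illustration only, not the review's method. Every name here (`ProposedAction`, `RuntimeEvidence`, `Limits`, `authorize`) and every threshold is a hypothetical assumption. The gate ignores the model's self-reported confidence and relies only on independent evidence, since a silent failure is one where the model looks confident while the underlying state is wrong.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"


@dataclass(frozen=True)
class ProposedAction:
    """Action emitted by the black-box model (hypothetical schema)."""
    target_xyz: tuple[float, float, float]  # commanded position, metres
    speed: float                            # commanded speed, m/s
    model_confidence: float                 # self-reported, NOT trusted by the gate


@dataclass(frozen=True)
class RuntimeEvidence:
    """Signals gathered independently of the model (hypothetical schema)."""
    sensor_age_s: float    # seconds since the last fresh observation
    position_std_m: float  # state-estimator position uncertainty, metres


@dataclass(frozen=True)
class Limits:
    """Illustrative thresholds; real values depend on the platform."""
    workspace_min: tuple[float, float, float] = (-0.5, -0.5, 0.0)
    workspace_max: tuple[float, float, float] = (0.5, 0.5, 0.8)
    max_speed: float = 0.25
    max_sensor_age_s: float = 0.2
    max_position_std_m: float = 0.02


def authorize(action: ProposedAction,
              evidence: RuntimeEvidence,
              limits: Limits = Limits()) -> tuple[Verdict, list[str]]:
    """Return a verdict plus the reasons for any denial."""
    reasons: list[str] = []

    # Physical envelope checks on the commanded action.
    in_bounds = all(
        lo <= value <= hi
        for value, lo, hi in zip(action.target_xyz,
                                 limits.workspace_min,
                                 limits.workspace_max)
    )
    if not in_bounds:
        reasons.append("target outside workspace")
    if action.speed > limits.max_speed:
        reasons.append("speed above limit")

    # Evidence checks aimed at silent-failure sources such as
    # stale observations and state-estimation error.
    if evidence.sensor_age_s > limits.max_sensor_age_s:
        reasons.append("sensor data is stale")
    if evidence.position_std_m > limits.max_position_std_m:
        reasons.append("state estimate too uncertain")

    return (Verdict.DENY if reasons else Verdict.ALLOW), reasons


# A confident, in-bounds action is still denied when the observation is stale.
action = ProposedAction(target_xyz=(0.2, 0.1, 0.4), speed=0.1, model_confidence=0.99)
stale = RuntimeEvidence(sensor_age_s=1.5, position_std_m=0.01)
print(authorize(action, stale))
```

The design choice worth noting is that the denial path returns explicit reasons, so a rejected action becomes an observable event rather than a silent one. A fuller gate would cover more guardrail functions than the four checks shown here.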

Original Article

Cached at: 06/02/26, 03:36 PM

Source: https://huggingface.co/papers/2606.00090

Abstract

Physical AI systems face safety challenges where black-box models can execute harmful actions without detection, necessitating comprehensive runtime guardrail mechanisms for safe operation.

Similar Articles

Runtime Governance: The Missing Layer for AI Agents in 2026

The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

The most dangerous part of AI agents begins when they receive authority

Concrete AI safety problems

AI safety is arguing about the wrong boundary
