When to Trust Imagination: Adaptive Action Execution for World Action Models

Hugging Face Daily Papers 05/07/26, 12:00 AM Papers

Summary

This paper introduces FFDC, a lightweight verifier for World Action Models that enables adaptive action chunk sizes by checking consistency between predicted and actual observations, improving efficiency and robustness in robotic manipulation.

World Action Models (WAMs) have recently emerged as a promising paradigm for robotic manipulation by jointly predicting future visual observations and future actions. However, current WAMs typically execute a fixed number of predicted actions after each model inference, leaving the robot blind to whether the imagined future remains consistent with the actual physical rollout. In this work, we formulate adaptive WAM execution as a future-reality verification problem: the robot should execute longer when the WAM-predicted future remains reliable, and replan earlier when reality deviates from imagination. To this end, we propose Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier that jointly reasons over predicted future actions, predicted visual dynamics, real observations, and language instructions to estimate whether the remaining action rollout can still be trusted. FFDC enables adaptive action chunk sizes as an emergent consequence of prediction-observation consistency, preserving the efficiency of long-horizon execution while restoring responsiveness in contact-rich or difficult phases. We further introduce Mixture-of-Horizon Training to improve long-horizon trajectory coverage for adaptive execution. Experiments on the RoboTwin benchmark and in the real world demonstrate that our method achieves a strong robustness-efficiency trade-off: on RoboTwin, it reduces WAM forward passes by 69.10% and execution time by 34.02%, while improving success rate by 2.54% over the short-chunk baseline; in real-world experiments, it improves success rate by 35%.
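The execution rule described above (keep executing while the imagined future matches reality, replan once it does not) can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the paper's implementation: `Plan`, `run_adaptive`, `plan_fn`, `step_fn`, `trust_fn`, and the 1-D toy world are all hypothetical names introduced here, and the simple threshold check only stands in for FFDC, which the abstract describes as a learned verifier that also reasons over predicted actions and language instructions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Plan:
    """One (hypothetical) WAM output: actions plus the imagined observation after each."""
    actions: List[float]
    predicted_obs: List[float]


def run_adaptive(
    plan_fn: Callable[[float], Plan],
    step_fn: Callable[[float, float], float],
    trust_fn: Callable[[float, float], bool],
    obs: float,
    total_steps: int,
) -> Tuple[float, int, List[int]]:
    """Execute total_steps actions, replanning whenever trust_fn rejects the rollout."""
    forward_passes = 0
    chunk_sizes: List[int] = []
    executed = 0
    while executed < total_steps:
        plan = plan_fn(obs)          # one model forward pass
        forward_passes += 1
        used = 0
        for action, imagined in zip(plan.actions, plan.predicted_obs):
            obs = step_fn(obs, action)   # act in the real environment
            used += 1
            executed += 1
            # Stop this chunk early if reality has drifted from imagination.
            if executed >= total_steps or not trust_fn(imagined, obs):
                break
        chunk_sizes.append(used)
    return obs, forward_passes, chunk_sizes


def toy_plan(obs: float, horizon: int = 8) -> Plan:
    """Toy planner: always move +1 and imagine perfect dynamics."""
    return Plan([1.0] * horizon, [obs + (i + 1) for i in range(horizon)])


def make_toy_world(slip_steps: set) -> Callable[[float, float], float]:
    """Toy dynamics: the action slips by 0.5 on the listed global step indices."""
    counter = {"n": 0}

    def step(obs: float, action: float) -> float:
        counter["n"] += 1
        slip = 0.5 if counter["n"] in slip_steps else 0.0
        return obs + action - slip

    return step


def toy_trust(imagined: float, real: float, tol: float = 0.25) -> bool:
    """Stand-in verifier: trust the rollout while the prediction error is small."""
    return abs(imagined - real) <= tol


final_obs, passes, chunks = run_adaptive(
    toy_plan, make_toy_world({5}), toy_trust, obs=0.0, total_steps=16
)
print(final_obs, passes, chunks)
```

In this toy run the slip at step 5 cuts the first chunk short, after which the full 8-action horizon is executed again, so the chunk size varies with prediction-observation agreement rather than being fixed in advance.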

Original Article

Cached at: 05/08/26, 06:58 AM

Source: https://huggingface.co/papers/2605.06222 Published on May 7

Submitted by Lin (https://huggingface.co/linjhong) on May 8

Abstract

World Action Models (WAMs) have recently emerged as a promising paradigm for robotic manipulation by jointly predicting future visual observations and future actions. However, current WAMs typically execute a fixed number of predicted actions after each model inference, leaving the robot blind to whether the imagined future remains consistent with the actual physical rollout. In this work, we formulate adaptive WAM execution as a future-reality verification problem: the robot should execute longer when the WAM-predicted future remains reliable, and replan earlier when reality deviates from imagination. To this end, we propose Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier that jointly reasons over predicted future actions, predicted visual dynamics, real observations, and language instructions to estimate whether the remaining action rollout can still be trusted. FFDC enables adaptive action chunk sizes as an emergent consequence of prediction-observation consistency, preserving the efficiency of long-horizon execution while restoring responsiveness in contact-rich or difficult phases. We further introduce Mixture-of-Horizon Training to improve long-horizon trajectory coverage for adaptive execution. Experiments on the RoboTwin benchmark and in the real world demonstrate that our method achieves a strong robustness-efficiency trade-off: on RoboTwin, it reduces WAM forward passes by 69.10% and execution time by 34.02%, while improving success rate by 2.54% over the short-chunk baseline; in real-world experiments, it improves success rate by 35%.

Get this paper in your agent:

hf papers read 2605.06222

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

Light-WAM: Efficient World Action Models with State-Fusion Action Decoding

AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing

World Action Models: The Next Frontier in Embodied AI

ActWorld: From Explorable to Interactive World Model via Action-Aware Memory

Foresight: Failure Detection for Long-Horizon Robotic Manipulation with Action-Conditioned World Model Latents
