embodied-agents

#embodied-agents

Multi-scale Mixture of World Models for Embodied Agents in Evolving Environments

arXiv cs.AI · 7h ago

This paper introduces MuSix, a framework for embodied agents that mixes and evolves scale-aware world models to support multi-scale reasoning and dynamic adaptation in evolving environments. It reports improvements over baselines on EmbodiedBench and HAZARD.

LabGuard: Grounding Natural-Language Laboratory Rules into Runtime Guards for Embodied Laboratory Agents

arXiv cs.AI · yesterday

LabGuard is a framework that translates natural-language laboratory safety rules into executable runtime monitors for embodied agents, reducing unsafe events from 39.5% to 23.8% while maintaining task success.

WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents

arXiv cs.AI · 2026-06-18

WorldLines introduces a benchmark for long-horizon embodied household assistance, featuring memory QA and embodied task planning with partial observability, and proposes ObsMem, a visibility-aware memory framework.

AgentSpec: Understanding Embodied Agent Scaffolds Through Controlled Composition

arXiv cs.CL · 2026-06-15

Introduces AgentSpec, a modular specification framework for systematically composing and analyzing embodied LLM agent scaffolds, revealing that performance depends on scaffold compatibility and interaction effects rather than isolated module strength.

Efficient Skill Grounding via Code Refactoring with Small Language Models

arXiv cs.AI · 2026-06-09

This paper presents RECENT, a framework that enables efficient skill grounding in embodied agents using small language models (sLMs) by refactoring code-based skills rather than regenerating them from scratch, achieving performance comparable to LLM-based methods.

Cosmos 3: Omnimodal World Models for Physical AI

Hugging Face Daily Papers · 2026-06-01

Cosmos 3 is a family of omnimodal world models from NVIDIA that jointly processes language, image, video, audio, and action sequences using a unified mixture-of-transformers architecture, achieving state-of-the-art performance in understanding and generation tasks for Physical AI.

Personalizing Embodied Multimodal Large Language Model Agents over Long-term User Interactions

arXiv cs.AI · 2026-05-27

This paper proposes Polar, a multimodal memory-augmented framework for personalizing embodied MLLM agents over long-term user interactions, using a knowledge graph and episodic memory to ground user-intended instances from accumulated context.

DexHoldem: Playing Texas Hold'em with Dexterous Embodied System

Hugging Face Daily Papers · 2026-05-18

DexHoldem is a real-world benchmark for evaluating embodied agents in dexterous manipulation tasks, using Texas Hold'em with a ShadowHand to test primitive execution, perception, and decision-making in a closed-loop setting.

Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning

arXiv cs.AI · 2026-05-14

Ego2World converts egocentric cooking videos (HD-EPIC) into executable symbolic worlds with graph-transition rules, enabling evaluation of belief-state planning under partial observation. Experiments show that belief memory improves task completion, suggesting it should be a first-class target in embodied agent evaluation.

Think Twice, Act Once: Verifier-Guided Action Selection For Embodied Agents

arXiv cs.AI · 2026-05-14

Proposes VeGAS, a test-time framework for MLLM-based embodied agents that samples multiple candidate actions and uses a generative verifier to select the most reliable one, achieving up to 36% relative improvement over CoT baselines on challenging tasks.

Continual Harness: Online Adaptation for Self-Improving Foundation Agents

Hugging Face Daily Papers · 2026-05-11

The paper introduces 'Continual Harness,' a framework enabling embodied AI agents to self-improve online without environment resets. It demonstrates significant progress in playing Pokémon games, achieving human-level performance through automated prompt and skill refinement.
