MindZero introduces a self-supervised reinforcement learning framework that trains multimodal large language models for efficient and robust online mental reasoning without requiring mental state annotations, outperforming model-based methods in accuracy and efficiency.
This paper introduces Differentiable Belief-based Opponent Shaping (D-BOS), a first-order method that treats observer beliefs as the shaped state and differentiates through belief update dynamics, allowing optimal strategies to emerge naturally from the environment's reward structure in hidden-role multi-agent settings.
OmniToM introduces a benchmark that evaluates large language models' theory of mind by requiring explicit belief structure extraction and labeling, revealing a bottleneck in tracking actor-specific beliefs despite strong performance on endpoint QA tasks.
This paper proposes Agent-ToM, a learning-to-monitor framework that uses Theory-of-Mind reasoning to detect covert malicious behavior in autonomous LLM agents by inferring their beliefs and intents, outperforming baseline monitors.
This paper presents OSCToM, an RL-guided method for generating adversarial data to test nested belief conflicts in LLMs, improving Theory of Mind reasoning on benchmarks like FANToM.
This paper proposes a new interactive evaluation paradigm for Theory of Mind in LLMs, finding that improvements on static benchmarks do not translate to better performance in dynamic human-AI interactions, highlighting the need for interaction-based assessments.
This paper introduces the Instruction Inference task to evaluate Theory of Mind capabilities in LLM-based agents during human-agent collaboration under incomplete or ambiguous instructions. The authors present Tomcat, an LLM agent tested with GPT-4o, DeepSeek-R1, and Gemma-3-27B, which performs comparably to human participants in inferring unspoken intentions.
OpenAI and University of Oxford researchers present LOLA (Learning with Opponent-Learning Awareness), a reinforcement learning method that enables agents to model and account for the learning of other agents, discovering cooperative strategies in multi-agent games like the iterated prisoner's dilemma and coin game.
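The idea of accounting for another agent's learning, as in the LOLA summary above, can be illustrated on a toy problem. The sketch below is not the paper's experimental setup (which uses the iterated prisoner's dilemma and the coin game). It assumes a hypothetical two-player bilinear game with scalar parameters, V1(x, y) = x * y and V2(x, y) = -x * y, and hand-derived gradients. The names `naive_step`, `lola_step`, `run`, `lr`, and `eta` are illustrative only. It compares plain simultaneous gradient ascent with a first-order LOLA-style step in which each player adds a correction for how its own parameter changes the opponent's next gradient step.

```python
# Toy illustration of a LOLA-style update (an assumed game, not the paper's setup).
# Player 1 maximizes V1(x, y) = x * y; player 2 maximizes V2(x, y) = -x * y.
# x and y are each player's single scalar parameter; the equilibrium is (0, 0).

def naive_step(x, y, lr):
    """Simultaneous gradient ascent: each player ignores the other's learning."""
    grad_x = y       # dV1/dx
    grad_y = -x      # dV2/dy
    return x + lr * grad_x, y + lr * grad_y

def lola_step(x, y, lr, eta):
    """First-order LOLA-style step: each player adds a term that accounts for
    the opponent's anticipated gradient step of size eta."""
    # Player 1: dV1/dx + eta * (dV1/dy) * (d^2 V2 / dx dy), where d^2 V2/dx dy = -1
    grad_x = y + eta * x * (-1.0)
    # Player 2: dV2/dy + eta * (dV2/dx) * (d^2 V1 / dy dx), where d^2 V1/dy dx = +1
    grad_y = -x + eta * (-y) * (1.0)
    return x + lr * grad_x, y + lr * grad_y

def run(step, steps=200, **kwargs):
    """Iterate a step function from (1, 1) and return the distance to (0, 0)."""
    x, y = 1.0, 1.0
    for _ in range(steps):
        x, y = step(x, y, **kwargs)
    return (x * x + y * y) ** 0.5

naive_dist = run(naive_step, lr=0.1)
lola_dist = run(lola_step, lr=0.1, eta=1.0)
print("naive distance from equilibrium:", naive_dist)
print("LOLA-style distance from equilibrium:", lola_dist)
```

In this assumed game the naive learners spiral away from the equilibrium, while the opponent-aware correction acts as a damping term and pulls both parameters toward (0, 0). Whether similar behavior appears in richer games depends on the payoff structure and step sizes.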