Tag
This paper formalizes the sim-to-real gap for foundation model agents as a Markov Decision Process (MDP) problem and proposes a unified research agenda that adapts classical solutions, such as domain randomization, to improve agent robustness and reliability in real-world deployment.
This paper revisits the Boolean Task Algebra (BTA) for zero-shot task composition in reinforcement learning, proving that in deterministic MDPs all optimal extended Q-functions collapse to just two components (universal and empty tasks), making the originally proposed logarithmic base task set redundant. The authors introduce a goal-set-based composition method that reduces learning costs and composition time while preserving policy performance across multiple experimental domains.
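To make the composition idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes tabular extended Q-functions stored as NumPy arrays of shape (states, goals, actions) and uses the Boolean operators from the original BTA formulation (OR as elementwise max, AND as elementwise min, NOT as reflection between the universal and empty tasks). The function `from_goal_set`, the toy numbers, and all variable names are hypothetical; `from_goal_set` is only one plausible reading of "goal-set-based composition", in which a task's Q-function is assembled per goal from the universal and empty components.

```python
import numpy as np

# Extended Q-functions are assumed to be arrays of shape (states, goals, actions).

def q_or(q1, q2):
    """Disjunction of two tasks: elementwise maximum."""
    return np.maximum(q1, q2)

def q_and(q1, q2):
    """Conjunction of two tasks: elementwise minimum."""
    return np.minimum(q1, q2)

def q_not(q, q_universal, q_empty):
    """Negation: reflect q between the universal and empty tasks."""
    return (q_universal + q_empty) - q

def from_goal_set(q_universal, q_empty, goal_mask):
    """Hypothetical goal-set composition: take the universal component for
    goals inside the set and the empty component for goals outside it."""
    mask = np.asarray(goal_mask, dtype=bool)[None, :, None]
    return np.where(mask, q_universal, q_empty)

def greedy_action(q_ext, state):
    """Act greedily by maximizing over goals first, then over actions."""
    return int(np.argmax(q_ext[state].max(axis=0)))

# Toy values (made up for illustration): 1 state, 2 goals, 2 actions.
q_u = np.array([[[1.0, 0.5], [0.5, 1.0]]])    # universal task
q_e = np.array([[[0.0, -0.5], [-0.5, 0.0]]])  # empty task

q_a = from_goal_set(q_u, q_e, [True, False])  # task whose goal set is {0}
q_b = from_goal_set(q_u, q_e, [False, True])  # task whose goal set is {1}

print(np.array_equal(q_or(q_a, q_b), q_u))
print(np.array_equal(q_and(q_a, q_b), q_e))
print(np.array_equal(q_not(q_a, q_u, q_e), q_b))
print(greedy_action(q_a, 0), greedy_action(q_b, 0))
```

In this toy setting, every composed Q-function can be read off from the goal set plus the two stored components, which is the intuition behind the redundancy claim in the summary above.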