This paper revisits the Boolean Task Algebra (BTA) for zero-shot task composition in reinforcement learning, proving that in deterministic MDPs all optimal extended Q-functions collapse to just two components (universal and empty tasks), making the originally proposed logarithmic base task set redundant. The authors introduce a goal-set-based composition method that reduces learning costs and composition time while preserving policy performance across multiple experimental domains.
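One way to read the collapse result is that a task is fully described by its goal set: for each goal, the task's extended Q-values are taken from the universal task if the goal belongs to the set, and from the empty task otherwise. The sketch below illustrates that reading only and is not the authors' implementation. The tabular array layout `(n_states, n_goals, n_actions)`, the function names `compose_extended_q` and `greedy_action`, and the max-over-goals action selection are all assumptions made for illustration.

```python
import numpy as np


def compose_extended_q(q_universal, q_empty, goal_mask):
    """Build an extended Q-table for the task described by a goal set.

    Assumed (hypothetical) layout: q_universal and q_empty have shape
    (n_states, n_goals, n_actions); goal_mask is a boolean vector of
    length n_goals that marks which goals belong to the task.
    """
    mask = np.asarray(goal_mask, dtype=bool)[None, :, None]
    # Per goal, pick the universal-task values if the goal is in the set,
    # otherwise the empty-task values.
    return np.where(mask, q_universal, q_empty)


def greedy_action(q_ext, state):
    """Pick an action by maximising over goals first, then over actions."""
    return int(np.argmax(q_ext[state].max(axis=0)))


# Toy tables: 3 states, 2 goals, 2 actions (values are arbitrary).
q_universal = np.arange(12, dtype=float).reshape(3, 2, 2)
q_empty = -np.ones((3, 2, 2))

# Under this reading, Boolean composition reduces to set operations
# on goal masks, with no further learning.
task_a = np.array([True, False])
task_b = np.array([False, True])

q_a_or_b = compose_extended_q(q_universal, q_empty, task_a | task_b)
q_a_and_b = compose_extended_q(q_universal, q_empty, task_a & task_b)
q_not_a = compose_extended_q(q_universal, q_empty, ~task_a)
```

Under these assumptions, composing a new task costs one masked selection over the goal axis, and only the universal and empty tables need to be learned.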