Tag
Proposes MAVIC, a method for multi-agent reinforcement learning that corrects value estimates at instruction boundaries to enable compliance with external natural language instructions while preserving base task performance.