value-correction


Macro-Action Based Multi-Agent Instruction Following through Value Cancellation

arXiv cs.AI · 2026-05-14

Proposes MAVIC, a method for multi-agent reinforcement learning that corrects value estimates at instruction boundaries to enable compliance with external natural language instructions while preserving base task performance.
