Tag
The paper introduces TickingCollabBench, a Minecraft-based multi-agent benchmark for time-sensitive, complementary collaboration tasks in dynamic environments, and demonstrates that, relative to a global-knowledge oracle, LLMs frequently fail under such conditions.