cooperative-task


GPTNT: Benchmarking Real-Time Collaboration Between Multimodal Agents on Keep Talking And Nobody Explodes

arXiv cs.AI · yesterday

The paper introduces GPTNT, a benchmark built on Keep Talking and Nobody Explodes in which two multimodal agents must collaborate in real time under time pressure and information asymmetry. The benchmark reveals critical weaknesses in state-of-the-art systems.
