GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives

arXiv cs.CL 05/12/26, 04:00 AM Papers

Summary

This paper introduces GAMBIT, a benchmark for evaluating adversarial robustness in multi-agent LLM collectives, featuring adaptive imposters and recalibration modes to address the limitations of existing shallow evaluations.

arXiv:2605.09027v1 Announce Type: new

Abstract: In multi-agent systems (MAS), a single deceptive agent can nullify all gains of an agentic AI collective and evade deployed defenses. However, existing adversarial studies on MAS target only shallow tasks and do not consider adaptive adversaries, which evolve their strategies to evade the very detectors trained to catch them. To address that gap, we introduce GAMBIT, a benchmark with three evaluation modes and two independent scores for evaluating imposter detectors: the first two modes measure zero-shot detection under increasing distribution shift, and a third recalibration mode measures how quickly a detector adapts to novel attacks from just 20 labeled examples. The benchmark comes with a dataset of 27,804 labeled instances spanning 240 co-evolved imposter strategies. Our contributions are threefold: (1) Using chess as a substrate deep reasoning problem and Gemini 3.1 Pro for agents, we release GAMBIT and its dataset to evaluate imposter detectors under realistic constraints against a stealthy adaptive imposter; (2) We introduce an adaptive imposter agent based on an efficient evolutionary framework, generalizable beyond chess, that collapses collective task performance while remaining essentially undetectable (50.5% F1-score with a Gemini-based detector); (3) We show that zero-shot evaluation can be highly misleading for adaptive adversaries: two detectors with near-identical zero-shot scores differ by 8x on few-shot adaptation, while the meta-learned variant converges 20x faster, a gap only visible in the recalibration mode. Altogether, GAMBIT provides the first multi-agent benchmark where adversarial attacks and defenses co-evolve, with an imposter framework generalizable beyond our use case, and promising techniques for fast recalibration in a rapidly evolving adversarial system. Code and data: https://anonymous.4open.science/r/gambit.
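
To give a feel for why an F1-score near 50% can be read as "essentially undetectable", the sketch below computes F1 from confusion counts for two trivial detectors. It assumes a balanced evaluation set (500 imposter, 500 honest instances), which the abstract does not state; the function name `f1_from_counts` and all counts are hypothetical and are not taken from the GAMBIT code or data.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical balanced set: 500 imposter and 500 honest instances (assumption).
# Expected counts for a detector that flags each instance with probability 0.5.
coin_flip = f1_from_counts(tp=250, fp=250, fn=250)

# A detector that flags every instance as an imposter.
always_flag = f1_from_counts(tp=500, fp=500, fn=0)

print(f"coin flip:   {coin_flip:.3f}")
print(f"always flag: {always_flag:.3f}")
```

Under this balance assumption a coin-flip detector already sits at F1 = 0.5, so a reported 50.5% would be close to chance. With a different class balance the chance-level F1 shifts, so the comparison should be redone against the actual label distribution.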

Cached at: 05/12/26, 07:08 AM

# GAMBIT: A Three-Mode Benchmark for Adversarial Robustness in Multi-Agent LLM Collectives
Source: [https://arxiv.org/abs/2605.09027](https://arxiv.org/abs/2605.09027)
[View PDF](https://arxiv.org/pdf/2605.09027)

> Abstract: In multi-agent systems (MAS), a single deceptive agent can nullify all gains of an agentic AI collective and evade deployed defenses. However, existing adversarial studies on MAS target only shallow tasks and do not consider adaptive adversaries, which evolve their strategies to evade the very detectors trained to catch them. To address that gap, we introduce GAMBIT, a benchmark with three evaluation modes and two independent scores for evaluating imposter detectors: the first two modes measure zero-shot detection under increasing distribution shift, and a third recalibration mode measures how quickly a detector adapts to novel attacks from just 20 labeled examples. The benchmark comes with a dataset of 27,804 labeled instances spanning 240 co-evolved imposter strategies. Our contributions are threefold: (1) Using chess as a substrate deep reasoning problem and Gemini 3.1 Pro for agents, we release GAMBIT and its dataset to evaluate imposter detectors under realistic constraints against a stealthy adaptive imposter; (2) We introduce an adaptive imposter agent based on an efficient evolutionary framework, generalizable beyond chess, that collapses collective task performance while remaining essentially undetectable (50.5% F1-score with a Gemini-based detector); (3) We show that zero-shot evaluation can be highly misleading for adaptive adversaries: two detectors with near-identical zero-shot scores differ by 8x on few-shot adaptation, while the meta-learned variant converges 20x faster, a gap only visible in the recalibration mode. Altogether, GAMBIT provides the first multi-agent benchmark where adversarial attacks and defenses co-evolve, with an imposter framework generalizable beyond our use case, and promising techniques for fast recalibration in a rapidly evolving adversarial system. Code and data: [this https URL](https://anonymous.4open.science/r/gambit).
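
One way to picture what a recalibration-style measurement could capture is sketched below: given a detector's F1 after adapting on k labeled examples, find the smallest k that reaches a target score, then compare detectors by that count. This is only an illustration of the idea. The abstract does not define the benchmark's two scores, so the function `examples_to_target`, the target of 0.80, and both learning curves are made-up assumptions, not GAMBIT results.

```python
from typing import Optional, Sequence


def examples_to_target(f1_curve: Sequence[float], target: float) -> Optional[int]:
    """Smallest number of labeled examples k with f1_curve[k] >= target.

    f1_curve[k] is the F1 after adapting on k examples (index 0 is zero-shot).
    Returns None if the target is never reached within the curve.
    """
    for k, f1 in enumerate(f1_curve):
        if f1 >= target:
            return k
    return None


# Made-up curves for two hypothetical detectors with the same zero-shot F1.
slow = [0.50, 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.71, 0.74, 0.77, 0.80]
fast = [0.50, 0.72, 0.81]

k_slow = examples_to_target(slow, target=0.80)
k_fast = examples_to_target(fast, target=0.80)
print(k_slow, k_fast)  # the two detectors tie at k = 0 but diverge afterwards
```

Both toy detectors look identical at index 0, which is the situation the abstract warns about: a zero-shot score alone would not separate them, while the examples-to-target count does.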

## Submission history

From: Alexandre Le Mercier \[[view email](https://arxiv.org/show-email/b985ade1/2605.09027)\] **\[v1\]** Sat, 9 May 2026 16:07:23 UTC (3,912 KB)

Similar Articles

Beyond Goodhart's Law: A Dynamic Benchmark for Evaluating Compliance in Multi-Agent Systems

Gate AI: LLM Security Benchmark Evaluation Methodology and Results

AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators

Robust Checkpoint Selection for Multimodal LLMs via Agentic Evaluation and Stability-Aware Ranking

CollabBench: Benchmarking and Unleashing Collaborative Ability of LLMs with Diverse Players via Proactive Engagement
