Tag
This paper introduces Bot-Mod, a moderation framework that identifies malicious intent in multi-agent systems through multi-turn dialogue and Gibbs-based sampling, and presents an evaluation dataset drawn from Moltbook.