This paper introduces LoFa, a benchmark for evaluating the robustness of LLMs against logical fallacies in persuasive contexts, featuring a multi-agent pipeline and a multi-round debate framework.
This paper proposes a framework for fallacy classification that uses LLMs to extract patterns from fallacious examples and their explanations, achieving statistically significant improvements over zero-shot baselines and demonstrating cross-dataset generalization.