This paper demonstrates that AI peer reviewers can be manipulated by modifying only presentation-level content (such as the abstract, framing, and narrative) without changing any of the scientific evidence; the attack reaches a 75.1% success rate. The authors introduce adversarial repackaging, a closed-loop attack that exploits AI reviewers' tendency to be impressed rather than convinced, and they release a benchmark for testing reviewer robustness.