llm-reviews

#llm-reviews

Review Arcade: On the Human Alignment and Gameability of LLM Reviews

arXiv cs.AI · 2026-05-29

This paper empirically evaluates how well LLM-generated reviews of scientific papers align with human reviews, and finds the alignment limited and variable. It also shows that authors can "game" LLM reviews by iteratively revising their papers to raise scores, with up to 35% of papers seeing statistically significant score increases.

#llm-reviews

Review Arcade: On the Human Alignment and Gameability of LLM Reviews

Hugging Face Daily Papers · 2026-05-27

This paper investigates how well LLM-generated reviews align with human judgment, using 1k real ACL 2025 submissions. It finds limited agreement and instability across models and prompts, and shows that scores can be artificially inflated without meaningful changes to the paper. The authors advise against relying solely on LLM reviews and call for discussion of their use in handling growing submission volumes.
