NeurIPS used uncalibrated AI detector for desk rejections [D]

Reddit r/MachineLearning News

Summary

A submission was desk-rejected from NeurIPS based on an uncalibrated AI detector (Pangram), raising concerns about circularity in the review process and unvalidated false-positive rates on the target distribution.

I recently had a submission desk-rejected from the NeurIPS 2026 Position Paper Track for an alleged AI-policy violation. After corresponding with the track leadership and reading their public blog post, I think the broader methodological issue is worth discussing here.

The track used Pangram, a proprietary AI-text detector, as part of the desk-rejection process. I was told that the materials considered for desk rejection were:

* the detector output
* the authors’ AI-use attestation

This creates a potential circularity problem. If a high detector score is used to judge the authors’ attestation as inconsistent, and that inconsistency is then used to justify desk rejection, the detector is not just an aid. It becomes a decisive part of the adjudication process.

The bigger issue is validation. The NeurIPS blog describes tests using Pangram audits, older ACM FAccT papers, synthetic AI-generated position papers, and manually edited samples. But the target population was NeurIPS 2026 Position Paper submissions, whose ground-truth authorship process is unknown.

So the key question is: **What is the false-positive rate of the final decision procedure on the actual target distribution?**

A false-positive rate measured on one distribution does not automatically transfer to another. If the actual submission pool produced a "surprisingly high flagged rate" (quoted from the NeurIPS blog post), that could indicate distribution shift or miscalibration rather than widespread AI use.

To sanity-check the detector’s behavior, I also ran Pangram on recent 2026 papers authored by NeurIPS Position Paper Track Chairs. Pangram returned scores including:

* 69% AI
* 45% AI
* 36% AI
* 24% AI

I am **not** claiming those papers were AI-written. In my view, Pangram’s outputs alone do not permit such a conclusion. And that is exactly the point.
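To make the false-positive-rate argument concrete, here is a minimal Python sketch of the underlying arithmetic (Bayes' rule). Every number in it is hypothetical and chosen only for illustration: the prevalence of AI-written submissions, the true-positive rate, and the false-positive rate are assumptions, not figures from Pangram or the NeurIPS blog post, and the function names are my own.

```python
def prob_ai_given_flag(prevalence: float, tpr: float, fpr: float) -> float:
    """P(AI-written | flagged) by Bayes' rule.

    prevalence: assumed share of submissions that are truly AI-written
    tpr: assumed true-positive rate of the detector on this pool
    fpr: assumed false-positive rate of the detector on this pool
    """
    flagged_ai = prevalence * tpr
    flagged_human = (1.0 - prevalence) * fpr
    total = flagged_ai + flagged_human
    if total == 0.0:
        return float("nan")  # nothing is flagged, so the ratio is undefined
    return flagged_ai / total


def implied_fpr(flag_rate: float, prevalence: float, tpr: float) -> float:
    """FPR that would be needed to explain an observed overall flag rate.

    Solves flag_rate = prevalence * tpr + (1 - prevalence) * fpr for fpr.
    """
    if prevalence >= 1.0:
        raise ValueError("prevalence must be below 1 to solve for fpr")
    return (flag_rate - prevalence * tpr) / (1.0 - prevalence)


if __name__ == "__main__":
    # Hypothetical scenario: 10% of submissions truly AI-written, 95% TPR.
    for fpr in (0.01, 0.05, 0.10):
        share = prob_ai_given_flag(prevalence=0.10, tpr=0.95, fpr=fpr)
        print(f"FPR={fpr:.2f} -> share of flagged papers that are AI: {share:.2f}")

    # Hypothetical observed flag rate of 20% under the same assumptions.
    print(f"implied FPR: {implied_fpr(0.20, 0.10, 0.95):.3f}")
```

Under these made-up inputs, moving the false-positive rate from 1% to 10% drops the share of flagged papers that are actually AI-written from roughly nine in ten to roughly one in two. That is why an FPR validated on a different corpus says little about the final decisions unless it is re-measured on the real submission pool.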
