factual-error-detection

An Empirical Analysis of Factual Errors in Human-Written Text and its Application

arXiv cs.CL · 6d ago

The paper presents a taxonomy of factual errors in human-written text, derived from newspaper corrections, and evaluates how well LLMs detect these errors. Even top models such as GPT-5.4 reach only 52% word-level F1, which underscores the task's difficulty.
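
For readers unfamiliar with the metric, the sketch below shows one common way a word-level F1 score can be computed: as the harmonic mean of precision and recall over the word positions flagged as erroneous. The summary does not describe the paper's exact scoring protocol, so the function name `word_level_f1`, the use of word indices, and the handling of empty inputs are all assumptions made for illustration.

```python
from __future__ import annotations


def word_level_f1(predicted: set[int], gold: set[int]) -> float:
    """F1 over word positions marked as factually wrong.

    `predicted` and `gold` are sets of word indices (hypothetical
    representation; the paper may tokenize or align differently).
    """
    # Assumption: if neither side flags any word, treat it as perfect agreement.
    if not predicted and not gold:
        return 1.0

    true_positives = len(predicted & gold)
    # No overlap means precision and recall are both zero (or undefined).
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Example: the detector flags words 1-3, the correction marks words 2-4.
score = word_level_f1({1, 2, 3}, {2, 3, 4})
print(f"{score:.3f}")
```

In this example two of the three flagged words overlap with the gold span, so precision and recall are both 2/3 and the F1 is also 2/3. Partial overlaps like this are penalized on both sides, which is part of why word-level scores tend to be lower than sentence-level ones.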
