factual-error-detection


An Empirical Analysis of Factual Errors in Human-Written Text and its Application

arXiv cs.CL · 6d ago

The paper derives a taxonomy of factual errors in human-written text from newspaper corrections, then evaluates how well LLMs detect such errors. Even top models such as GPT-5.4 reach a word-level F1 score of only 52%, which underscores how difficult the task is.
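For readers unfamiliar with the metric, the sketch below shows one common way to compute a word-level F1 score. It is an illustration under stated assumptions, not the paper's evaluation code: the summary does not say how the paper defines the metric, so this assumes that predictions and gold labels are sets of word positions flagged as erroneous. The function name `word_level_f1` and the example sentence are hypothetical.

```python
def word_level_f1(predicted: set[int], gold: set[int]) -> float:
    """F1 over word positions flagged as factual errors.

    Assumption: both arguments are sets of 0-based word indices.
    The paper's exact definition may differ.
    """
    # Nothing to find and nothing flagged counts as a perfect score here.
    if not predicted and not gold:
        return 1.0

    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(predicted)  # flagged words that are real errors
    recall = true_positives / len(gold)          # real errors that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: gold annotators marked words 5 and 7 as wrong,
# while the model flagged words 6 and 7.
words = "The treaty was signed in Paris in 1920".split()
gold = {5, 7}
predicted = {6, 7}

score = word_level_f1(predicted, gold)
print(f"word-level F1 = {score:.2f}")
```

In this toy case one of the two flagged words is correct and one of the two true errors is found, so precision and recall are both 0.5. Under this definition, a model scoring about 52% either misses or wrongly flags a large share of words.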
