label-errors

Revising RVL-CDIP: Quantifying Errors and Test-Train Overlap

arXiv cs.CL · yesterday

This paper identifies and corrects label errors and test-train overlap in the RVL-CDIP document classification dataset, reporting 12% label errors and 35% duplication. Correcting these issues improves classification accuracy and out-of-distribution generalization.
