Tag
This paper identifies and corrects label errors and test-train overlap in the RVL-CDIP document classification dataset, finding 12% label errors and 35% duplication. Correction improves classification accuracy and out-of-distribution generalization.
This paper introduces a KAN-enhanced BiGRU architecture for classifying and summarizing multilingual legal documents from Bangladesh. It achieves modest accuracy and ROUGE scores, and the KAN block improves classification accuracy over the baseline BiGRU.
This systematic review of 139 studies proposes a unified framework for document classification via multimodal and multiview information fusion and conducts a meta-analysis. It finds that fusion improves accuracy (mean gain of +5.28 percentage points) and highlights reproducibility challenges.