dataset-visibility


Beyond Catalogue Counts: the Dataset Visibility Asymmetry in Low-Resource Multilingual NLP

arXiv cs.CL · 2026-05-19

This paper introduces the Resource Density Index (RDI) and uses LLM-assisted citation mining to show that many languages appear data-poor in catalogue records yet have substantial dataset activity in the research literature. It identifies this gap as a visibility asymmetry in low-resource multilingual NLP.
