Tag
This paper audits the license provenance of more than twenty African NLP corpus families, identifies compatibility failures such as the JW300 violation and hidden NoDerivs clauses, and provides a due-diligence checklist for creating legally clean datasets.