Tag
This paper finds that 42.6% of annotator disagreement in HateXplain concentrates at the hate/offensive boundary, and shows that majority-vote aggregation silences minority values and yields models that are wrong yet highly confident on contested inputs.