hate-speech

#hate-speech

Hate Speech Detection in Turkish and Arabic Languages: A Comprehensive Study

arXiv cs.CL · 2d ago

Introduces a comprehensive hate speech dataset for Turkish and Arabic, and develops state-of-the-art BERT-based models for hate speech analysis including classification, intensity prediction, target identification, and span detection.

Majority Vote Silences Minority Values: Annotator Disagreement at the Hate/Offensive Boundary in HateXplain

arXiv cs.CL · 4d ago

This paper finds that 42.6% of annotator disagreement in HateXplain concentrates at the hate/offensive boundary, demonstrating that majority vote silences minority values and leads to models being wrong but highly confident on contested inputs.

Simulating Hate Speech Cascades with Multi-LLM Agents: Empirical Grounding, Modeling Fidelity, and Intervention Strategies

arXiv cs.AI · 2026-06-18

This paper studies hate speech cascades on Bluesky and uses multi-LLM agents to simulate them, finding that such simulations reproduce key patterns like stance monoculture and toxicity-delta direction, and that amplifier targeting on dense networks yields 7.5–12.9% reduction in hateful content with low benign collateral.

Racist comments targeting politicians tripled since Meta relaxed its rules

Ars Technica · 2026-06-10

A new report from the Center for Countering Digital Hate (CCDH) reveals that racist comments targeting politicians tripled after Meta relaxed its content moderation rules, with violent threats and hate speech quadrupling and bullying doubling.

Meta Changed Its Speech Rules. Then Threats Against Politicians Skyrocketed

Wired · 2026-06-09

New research from the Center for Countering Digital Hate shows that abusive comments, violent threats, and hate speech against US lawmakers on Facebook tripled or quadrupled in the six months after Meta relaxed its speech rules in early 2025.

@elonmusk: Grok

X AI KOLs Following · 2026-05-26

Elon Musk highlights Grok's response to a user who copied Gemini's analysis of a Belgian hate speech conviction and asked Grok to reply.

Assisted Counterspeech Writing at the Crossroads of Hate Speech and Misinformation

arXiv cs.CL · 2026-05-22

This paper studies the use of large language models to assist expert counterspeech writing when hate speech and misinformation co-occur, testing knowledge-driven strategies with human evaluation. The mixed strategy combining fact-checkers' and NGOs' guidelines proved most effective.

Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study

arXiv cs.CL · 2026-05-15

This replication study evaluates DExperts for mitigating toxicity in LLMs, finding near-perfect safety against explicit toxicity but reduced effectiveness against implicit hate speech and a significant latency trade-off.

IYKYK (But AI Doesn't): Automated Content Moderation Does Not Capture Communities' Heterogeneous Attitudes Towards Reclaimed Language

arXiv cs.CL · 2026-04-21

Researchers from UCLA examine how automated content moderation tools, including Perspective API, fail to distinguish reclaimed from hateful uses of slurs affecting LGBTQIA+, Black, and women's communities. The study finds low inter-annotator agreement even among in-group members and poor alignment between community judgments and AI moderation tools, highlighting the need for context-sensitive approaches.
