toxicity

Tag

#toxicity

Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study

arXiv cs.CL · 2d ago

This replication study evaluates DExperts as a toxicity mitigation method for LLMs. It finds near-perfect safety against explicit toxicity, reduced effectiveness against implicit hate speech, and a significant latency cost.

#toxicity

Toxicity on Social Media – The Noisy Room

Hacker News Top · 5d ago

A Stanford study analyzing billions of social media posts reveals that only ~3% of users generate severely toxic content, but engagement-driven algorithms disproportionately amplify this minority, distorting public perception and driving self-censorship among the majority.
