This paper investigates how LLM outputs vary with conversational context, finding that the topic of conversation, rather than explicit user demographics, is the primary driver of disparities in high-stakes scenarios such as salary advice.
This paper presents a large-scale analysis of four harmful language detection datasets, examining how annotator characteristics and linguistic features interact to influence annotation variation. It highlights intersectional effects and warns against generalizing findings across different datasets.