Tag
This paper proposes a method for measuring cultural localization in AI-generated stories. It finds that only a small fraction of the vocabulary distinguishes nationalities while the narratives rely on shared templates, and that cultural markers from many Global South countries are often offensive.
This paper audits six large language models for gender stereotyping across English, Korean, Chinese, and Japanese, anchoring against human baselines. It finds that LLM stereotyping often exceeds human cross-country variation and can compound across languages, introducing a four-pattern framework to characterize such behaviors.