Tag
This paper replicates the finding of 'emotion vectors' in the open-weight LLMs Apertus-8B and Gemma-4-E4B, showing that valence geometry is recoverable in both models, although the layer at which it emerges differs between them. The study also finds that arousal encoding is sensitive to the story corpus used for extraction.
This paper empirically tests the psychometric reliability of LLM-based user state classification, finding that only 31 of 213 metrics met reliability criteria, a result that calls trust in real-time adaptive systems into question.
This replication study evaluates DExperts for mitigating toxicity in LLMs, finding near-perfect safety against explicit toxicity but reduced effectiveness against implicit hate speech, along with a significant latency trade-off.