Tag
SHALA-LLM is a reinforcement learning framework that enables LLMs to learn directly from annotator distributions and to dynamically prioritize highly ambiguous samples during alignment, improving both agreement with human label distributions and classification performance.
This paper introduces the eJSL Dialog dataset for emotion recognition in sign language conversations, addressing the lack of conversational context in existing datasets. Benchmarking shows a domain gap when applying generic multimodal models, highlighting the need for context-aware visual extractors for sign language.
This paper presents a multimodal emotion recognition module for proactive conversational agents, using facial recognition and linguistic analysis. A user study with 20 participants reveals a 'poker face' effect where visual cues are unreliable, while linguistic analysis proves more accurate; the study also shows agents can elicit emotions through conversational adaptation.
This paper proposes a plug-and-play module that uses self-paced curriculum learning to improve modality balance in multimodal conversational emotion recognition, achieving consistent F1-score improvements on the IEMOCAP and MELD datasets.
This paper proposes a lightweight framework using sticky factorial HDP-HMMs to model conversational emotion as latent regimes from multimodal valence-arousal trajectories, aiming for interpretable and computationally efficient emotional state tracking.
This article introduces EmoS, a high-fidelity multimodal benchmark designed for fine-grained streaming emotional understanding, addressing limitations in ecological validity and labeling reliability found in existing datasets.
This paper examines how large language models express social emotions relative to human cultural norms, finding systematic misalignment: LLMs show inconsistent patterns of engaging vs. disengaging emotion expressivity across cultural personas (European American and Latin American) compared to human responses.