SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution

Hugging Face Daily Papers

Summary

The SAVOIR framework applies cooperative game theory and Shapley values to train language agents with improved social intelligence, achieving state-of-the-art results on the SOTOPIA benchmark and matching GPT-4o performance.

Social intelligence, the ability to navigate complex interpersonal interactions, presents a fundamental challenge for language agents. Training such agents via reinforcement learning requires solving the credit assignment problem: determining how individual utterances contribute to multi-turn dialogue outcomes. Existing approaches directly employ language models to distribute episode-level rewards, yielding attributions that are retrospective and lack theoretical grounding. We propose SAVOIR (ShApley Value fOr SocIal RL), a novel principled framework grounded in cooperative game theory. Our approach combines two complementary principles: expected utility shifts evaluation from retrospective attribution to prospective valuation, capturing an utterance's strategic potential for enabling favorable future trajectories; Shapley values ensure fair credit distribution with axiomatic guarantees of efficiency, symmetry, and marginality. Experiments on the SOTOPIA benchmark demonstrate that SAVOIR achieves new state-of-the-art performance across all evaluation settings, with our 7B model matching or exceeding proprietary models including GPT-4o and Claude-3.5-Sonnet. Notably, even large reasoning models consistently underperform, suggesting social intelligence requires qualitatively different capabilities than analytical reasoning.
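The Shapley-based attribution the abstract describes can be illustrated with a toy example: treat each utterance in a dialogue as a "player" in a cooperative game and average its marginal contribution over all orderings. This is a minimal sketch, not the paper's method; the utterance names and the hypothetical value table (which stands in for SAVOIR's learned expected-utility estimate of an utterance set) are invented for illustration.

```python
from itertools import permutations

def shapley_values(players, value_fn):
    """Exact Shapley values: average each player's marginal
    contribution over all orderings (exponential cost; fine for small n)."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = []
        for p in order:
            before = value_fn(frozenset(coalition))
            coalition.append(p)
            after = value_fn(frozenset(coalition))
            phi[p] += after - before  # marginal contribution of p
    for p in players:
        phi[p] /= len(orders)
    return phi

# Hypothetical 3-utterance dialogue; the table below is a made-up
# stand-in for an expected-utility estimate of each utterance subset.
utterances = ["greet", "propose", "concede"]
toy_value = {
    frozenset(): 0.0,
    frozenset({"greet"}): 1.0,
    frozenset({"propose"}): 2.0,
    frozenset({"concede"}): 1.0,
    frozenset({"greet", "propose"}): 4.0,
    frozenset({"greet", "concede"}): 2.0,
    frozenset({"propose", "concede"}): 5.0,
    frozenset({"greet", "propose", "concede"}): 8.0,
}
credits = shapley_values(utterances, toy_value.__getitem__)

# Efficiency axiom: per-utterance credits sum to the episode-level reward.
assert abs(sum(credits.values()) - 8.0) < 1e-9
```

The efficiency check at the end shows one of the axiomatic guarantees the abstract cites: the attributed credits exactly partition the episode reward, unlike heuristic LLM-based reward distribution.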

Cached at: 04/23/26, 07:47 AM


Source: https://huggingface.co/papers/2604.18982

Abstract

The SAVOIR framework uses cooperative game theory to improve social intelligence in language agents, combining expected-utility valuation and Shapley values for better credit assignment in dialogue systems.



Get this paper in your agent:

hf papers read 2604.18982

Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

SkillOS: Learning Skill Curation for Self-Evolving Agents

Hugging Face Daily Papers

This paper introduces SkillOS, a reinforcement learning framework that enables LLM agents to learn long-term skill curation policies for self-evolution, improving performance and generalization across tasks.