Building Social World Models with Large Language Models
Summary
The paper introduces the Social World Model (SWM) framework, which uses large language models to model how social beliefs change in response to events, without explicit annotations linking events to belief shifts. It also presents a benchmark, SWM-bench, derived from the Kalshi and Polymarket prediction markets, and reports state-of-the-art results on Kalshi data and competitive results on Polymarket data.
Cached at: 06/11/26, 09:36 PM
Source: https://huggingface.co/papers/2606.11482
Abstract
Social World Model framework captures evolution of social beliefs in response to events through temporal pattern mining and evidence lower bound optimization without explicit human annotations.
Understanding and predicting how social beliefs evolve in response to events -- from policy changes to scientific breakthroughs -- remains a fundamental challenge in social science. Given LLMs’ commonsense knowledge and social intelligence, we ask: Can LLMs model the dynamics of social beliefs following social events? In this work, we introduce the concept of the Social World Model (SWM), a general framework designed to capture how social beliefs evolve in response to major events. SWM learns state-transition functions for social beliefs by mining temporal patterns in social data and optimizing the evidence lower bound, without the need for explicit human annotations linking events to belief shifts, or for expensive census data. To evaluate SWM, we introduce a benchmark, SWM-bench, derived from real-world prediction markets, specifically Kalshi and Polymarket. SWM-bench includes over 12k data points for social belief prediction tasks spanning diverse domains such as politics, finance, and cryptocurrency. Our experimental results show that SWM significantly outperforms time-series foundation models, achieving state-of-the-art results on Kalshi data and demonstrating competitive performance on Polymarket data, while offering interpretable insights into the underlying mechanisms of social belief dynamics.
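The abstract does not spell out the objective, so the following is only a generic illustration of what an evidence lower bound (ELBO) is, not the paper's formulation. It is a minimal sketch assuming a single observed belief shift x and a discrete latent driver z; the state names and probabilities are hypothetical toy values. It shows that the ELBO never exceeds the log evidence log p(x) and matches it when the approximate distribution q equals the true posterior.

```python
import math

def elbo(joint, q):
    """ELBO = sum_z q(z) * (log p(x, z) - log q(z)) for a discrete latent z."""
    return sum(qz * (math.log(joint[z]) - math.log(qz))
               for z, qz in q.items() if qz > 0)

# Hypothetical toy numbers (not from the paper): joint[z] = p(x, z) for one
# observed belief shift x and a latent driver z.
joint = {"event_driven": 0.30, "noise": 0.10}
total = sum(joint.values())
log_evidence = math.log(total)                      # log p(x)

q_guess = {"event_driven": 0.5, "noise": 0.5}       # arbitrary approximation
q_exact = {z: p / total for z, p in joint.items()}  # true posterior p(z | x)

elbo_guess = elbo(joint, q_guess)  # strictly below log p(x) here
elbo_exact = elbo(joint, q_exact)  # equals log p(x) up to rounding
```

Maximizing the ELBO over q, and over model parameters in a learned setting, therefore tightens a bound on the data likelihood without needing labels for z, which is consistent with the abstract's claim that no explicit event-to-belief annotations are required.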
Similar Articles
Large Language Models for Causal Relations Extraction in Social Media: A Validation Framework for Disaster Intelligence
This paper proposes a validation framework for using Large Language Models to extract causal relations from social media posts during disasters. It evaluates the effectiveness of LLMs in identifying cause-effect relationships and compares them against expert-grounded reference graphs to assess reliability and risks.
stable-worldmodel-v1: Reproducible World Modeling Research and Evaluation
Stable-Worldmodel (SWM) is a modular and standardized research framework for developing and evaluating world models, designed to improve reproducibility and support robustness and continual learning research.
Cultural Adaptation in Large Language Models for Political Discourse
This paper explores methods for adapting large language models to cultural contexts in political discourse, aiming to improve cross-cultural understanding and reduce bias.
Why We Need World Models for AGI: Where LLMs Fail and How World Models May Outperform
This paper argues that large language models struggle with causal reasoning and long-horizon planning due to a mismatch between sequence prediction and reasoning over latent environment dynamics, and introduces the Latent Dynamics Inference perspective along with the Flux environment to study these limitations.
Assessing Capabilities of Large Language Models in Social Media Analytics: A Multi-task Quest
Researchers from Utah State and Vanderbilt benchmark GPT-4, Gemini 1.5 Pro, DeepSeek-V3, Llama 3.2 and BERT on three social-media tasks—authorship verification, post generation, and user attribute inference—introducing new sampling protocols and taxonomies to reduce bias and enable reproducible benchmarks.