Measuring the Semantic Structure and Evolution of Conspiracy Theories

arXiv cs.CL Papers

Summary

This paper measures the semantic structure and evolution of conspiracy theories using 169.9M Reddit comments from r/politics (2012–2022), introducing "semantic objects", coherent regions of language space bounded by semantic neighborhoods, to track how the meanings of conspiracy theories change over time in ways that keyword-based approaches cannot capture.


# Measuring the Semantic Structure and Evolution of Conspiracy Theories

Source: https://arxiv.org/html/2603.26062

###### Abstract

Research on conspiracy theories largely focuses on belief formation, exposure, and diffusion, while paying less attention to how their meanings change over time. This gap persists partly because conspiracy-related terms are often treated as stable lexical markers, making it difficult to separate genuine semantic changes from surface-level vocabulary changes. In this paper, we measure the semantic structure and evolution of conspiracy theories in online political discourse. Using 169.9M comments from Reddit's r/politics subreddit spanning 2012–2022, we first demonstrate that conspiracy-related language forms coherent and semantically distinguishable regions of language space, allowing conspiracy theories to be treated as semantic objects. We then track how these objects evolve over time using aligned word embeddings, enabling comparisons of semantic neighborhoods across time periods. Our analysis reveals that conspiracy theories evolve non-uniformly, exhibiting patterns of semantic stability, expansion, contraction, and replacement that are not captured by keyword-based approaches alone.

## 1 Introduction

In November 2016, a gunman enters the Comet Ping Pong pizzeria in Washington, D.C., USA, firing an AR-15 rifle to "self-investigate" claims that the restaurant harbors a child trafficking ring run by Democratic Party officials. The conspiracy theory motivating this attack, "Pizzagate," begins as a specific accusation about a single pizza restaurant connected to Hillary Clinton's campaign chairman. By 2020, however, "Pizzagate" has been absorbed into the QAnon conspiracy theory, linking not just to child trafficking but to claims about a global cabal of elites, deep state actors, and institutional control. While the term "Pizzagate" remains constant across these years, its meaning (the concepts, actors, and narratives it encompasses) fundamentally transforms and fragments. This case illustrates a central methodological challenge: disambiguating semantic evolution (changes in what a conspiracy theory means) from lexical churn (changes in the vocabulary used to express it).

While existing research predominantly focuses on who believes conspiracy theories, how they spread, and what psychological factors drive belief formation (Douglas et al. 2017; Del Vicario et al. 2016; Samory and Mitra 2018b), understanding how conspiracy theories change in meaning over time remains critically understudied. Traditional keyword-based approaches (Gulordava and Baroni 2011) track specific terms like "pizzagate" or "deep state," but cannot distinguish between three fundamentally different processes: (i) the term itself persists while its underlying meaning transforms, (ii) the meaning remains stable while the vocabulary changes entirely, and (iii) a single theory fragments into multiple interpretations referenced by the same term.

Drawing on distributional semantics, we move beyond isolated keywords to **semantic neighborhoods**: the collections of terms that consistently co-occur with a conspiracy theory and define what it actually means in discourse. These neighborhoods allow us to define **semantic objects**, coherent regions of the embedding space bounded by semantic neighborhoods, which serve as stable units of analysis representing a conspiracy theory's meaning at a given point in time. We can then track how these objects evolve (whether they remain stable, relocate in semantic space, expand or contract, or fragment into multiple meanings) while separately measuring vocabulary turnover, revealing evolutionary patterns that prior methods fundamentally cannot capture.

We develop and apply a **semantic object framework** to study conspiracy theory evolution in online political discourse. Using 169.9M comments from Reddit's r/politics spanning 2012–2022, we address two research questions:

- **RQ1. Is language associated with conspiratorial discourse semantically distinguishable from language used in non-conspiratorial discourse?** (Section 3) Specifically, we ask whether conspiracy-related discourse occupies a coherent region of language space that is meaningfully separable from non-conspiratorial discourse. We establish this property first because semantic evolution can only be meaningfully studied if conspiratorial language forms a coherent and distinguishable semantic region within the discourse under study; without such structure, neither individual conspiracy theories nor their interactions over time can be well defined, as they would be diffuse across the entire language space. We address this question by constructing semantic neighborhoods anchored on 19 distinct conspiracy theories and evaluating their coherence and boundaries using embedding-based cluster analysis and human expert annotations. This step establishes whether the semantic neighborhoods associated with conspiracy theories can be treated as semantic objects.

- **RQ2. How do the semantics and lexica associated with conspiracy theories evolve over time?** (Section 4) Building on the findings of RQ1 and on prior research on diachronic word embeddings (Hamilton et al. 2016; Kulkarni et al. 2015), we create an analysis framework that allows comparisons of semantic objects, each representing the semantic neighborhood of an individual conspiracy theory, across time periods (a minimal alignment sketch follows this list). These comparisons allow us to reason directly about semantic change without relying on vocabulary shifts alone, enabling accurate tracking and characterization of the evolutionary patterns associated with individual conspiracy theories. Alongside semantic shifts, we also track changes in the lexica to capture how the language around these theories changes even when the underlying meaning remains stable.
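The cross-period comparison relies on embedding alignment; below is a minimal sketch, assuming the orthogonal Procrustes approach of Hamilton et al. (2016). The function name and the row-aligned shared-vocabulary bookkeeping are ours, not the paper's.

```python
import numpy as np

def procrustes_align(base: np.ndarray, other: np.ndarray) -> np.ndarray:
    """Rotate `other` onto `base` via orthogonal Procrustes.

    Rows must be aligned over the vocabulary shared by the two periods:
    row i of `base` and row i of `other` embed the same word.
    """
    # The orthogonal W minimizing ||other @ W - base||_F is W = U @ V^T,
    # where U, S, V^T come from the SVD of other^T @ base.
    u, _, vt = np.linalg.svd(other.T @ base)
    return other @ (u @ vt)
```

After alignment, a concept label's vectors from different periods live in a common space, so cosine distances and nearest-neighbor overlap across periods become directly comparable.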

Broadly, our findings show that conspiracy-related discourse forms coherent and semantically distinguishable clusters, validating the use of conspiracy-related semantic objects as units in our analysis framework. We also find that conspiracy theories evolve non-uniformly: some retain their semantic core over long periods; others narrow their semantics to reflect more specific theories or broaden them by absorbing concepts from other theories; and a few undergo drastic shifts that retain little of their past meaning. Many of these patterns are invisible to approaches focused on lexica or keywords alone. We also observe that political scandal conspiracies underwent semantic replacement, whereas elite control conspiracies developed multiple, diverging narratives.

## 2 Data and Preprocessing

In this section, we describe our dataset, methods for the construction of temporal embeddings, and approach for identifying conspiracy theories and their themes.

### Identifying conspiracy theories and thematic concept labels

We focus on a fixed set of 19 conspiracy theories that were prominent during 2012–2017 (cf. Table 1). While this set is not exhaustive, it serves as a proof of concept, and the ideas presented here extend to arbitrary conspiracy theories. This choice allows us to examine how these theories evolve across the three time periods in our study as social and political dynamics change.

These conspiracy theories were curated based on prior academic literature (Mahl et al. 2021; Hanley et al. 2023; Samory and Mitra 2018a, b; Bessi et al. 2015a; Schabes 2020) and media reporting (Thomas 2025; Uscinski 2016). These theories include well-established U.S.-centric conspiracy theories (e.g., illuminati and chemtrails), event-driven conspiracies (e.g., related to the Sandy Hook shooting and Boston bombing), and squarely political conspiracies (e.g., Emailgate and Russiagate).

For each of these, we defined a **concept label**: a term that encapsulates the core theme of the theory and anchors it in discourse. These labels reflect how the specific theory is commonly referenced in online discussions. For example, the concept label 'deep state' refers to conspiracies alleging that a hidden network of government actors secretly controls U.S. policy. Each concept label serves as an entry point for identifying the surrounding discourse related to a conspiracy theory. This was inspired by prior work (Samory and Mitra 2018b, a) which described similar "overarching themes" that structured conspiratorial discourse in online communities. We treat these concept labels as anchors rather than exhaustive representations—i.e., we do not assume that a conspiracy theory is fully captured by these labels. Instead, these labels are used to identify and analyze the broader semantic neighborhoods in which conspiratorial discourse appears.

### Dataset

Our baseline dataset consists of 169.9M comments from the r/politics subreddit between 2012 and 2022, a period that includes multiple highly contentious elections, economic instability, and a global pandemic. We select r/politics for two main reasons. First, it is the largest U.S. political discussion space on Reddit, with continuous engagement over the study period. Second, U.S.-related conspiracy theories frequently appear in its comments, both directly and through rebuttal. Together, this makes it a suitable venue for studying discourse around conspiracy theories in the mainstream U.S. political context.

We intentionally avoid conspiracy-focused communities (e.g., r/conspiracy) because they represent niche communities where conspiracy theories are normalized and rarely critiqued or contested. Our goal is to understand the evolution of conspiracy theories in mainstream discourse while accounting for the articulation, rebuttal, and reframing of conspiracy narratives over time.

### Constructing temporal word embeddings

Prior to constructing embeddings, we preprocess each comment by removing URLs, eliminating stopwords, standardizing casing, and performing lemmatization. We construct word embeddings to capture the semantic relationships between words, enabling us to model how the meanings of terms and entire conspiracy theories evolve over time.
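The paper does not specify its preprocessing tooling, so the sketch below instantiates the four steps (URL removal, stopword removal, lowercasing, lemmatization) with NLTK; the helper name and regexes are our assumptions.

```python
import re
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# One-time setup: nltk.download("stopwords"); nltk.download("wordnet")
STOPWORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()
URL_RE = re.compile(r"https?://\S+|www\.\S+")

def preprocess(comment: str) -> list[str]:
    """Strip URLs, lowercase, drop stopwords, and lemmatize one comment."""
    text = URL_RE.sub(" ", comment).lower()
    tokens = re.findall(r"[a-z']+", text)  # keep alphabetic runs and apostrophes
    return [LEMMATIZER.lemmatize(t) for t in tokens if t not in STOPWORDS]
```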

We also partition comments into three U.S.-specific political time periods: 2012–2014 (fringe conspiratorial discourse marked by events such as the Sandy Hook shooting and narratives like "crisis actor" claims), 2015–2019 (mainstreamed, politically salient conspiracies surrounding the 2016 elections), and 2020–2022 (pandemic-driven conspiratorial narratives), each reflecting a distinct context. The time spans are unequal because our focus is on distinct periods in U.S. politics; the method itself generalizes to any periodization.

Next, to identify multi-word expressions, we use a conditional probability-based approach, motivated by the observation that many conspiratorial terms derive their meaning from short word combinations rather than individual words. We focus on bigrams because they are expressive enough to capture these meanings (e.g., false flag, crisis actor) while remaining frequent enough to allow statistical estimation. For each bigram (w₁, w₂), we compute the conditional probability Pr(w₂|w₁). We calculate a z-score for each bigram relative to the distribution of conditional probabilities and keep as significant those whose z-score exceeds 1.96 (the 97.5th percentile under a normal approximation, i.e., two-sided significance at the 95% level). Significant bigrams are collapsed into single tokens in every comment that contains them (e.g., the bigram false flag becomes the single token false_flag). This process is applied recursively so that longer n-gram phrases can be identified.
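A single pass of this procedure might look like the sketch below. The estimator follows the description above; the greedy left-to-right merge and the exact recursion schedule are our assumptions.

```python
from collections import Counter
import statistics

def collapse_significant_bigrams(comments, z_threshold=1.96):
    """One merge pass: score bigrams by Pr(w2|w1), keep z > threshold, collapse.

    `comments` is a list of token lists; repeated passes let merged tokens
    (e.g., false_flag) combine further into longer n-grams.
    """
    unigram_counts, bigram_counts = Counter(), Counter()
    for tokens in comments:
        unigram_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))

    # Pr(w2 | w1) = count(w1, w2) / count(w1)
    cond_prob = {
        (w1, w2): c / unigram_counts[w1]
        for (w1, w2), c in bigram_counts.items()
    }

    # z-score each bigram against the distribution of conditional probabilities
    mean = statistics.mean(cond_prob.values())
    std = statistics.pstdev(cond_prob.values())
    significant = {
        bigram for bigram, p in cond_prob.items()
        if std > 0 and (p - mean) / std > z_threshold
    }

    # Greedy left-to-right collapse of significant bigrams into single tokens
    merged_comments = []
    for tokens in comments:
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in significant:
                merged.append(f"{tokens[i]}_{tokens[i + 1]}")
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        merged_comments.append(merged)
    return merged_comments, significant
```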

Finally, for each time period, we construct independent word embeddings using the Word2Vec (Mikolov et al. 2013) architecture on the processed comments from that period. Specifically, we train a continuous bag-of-words (CBOW) model with an embedding dimensionality of 100, a context window size of 5, and a minimum token frequency threshold of 5. Each embedding thus captures the semantic relationships between words (and identified phrases) as they were used within that temporal context, independently of how they occur in other periods. We chose Word2Vec over more recent transformer-based models because Basile et al. (2020) found that static embeddings perform better for diachronic analysis. These embeddings serve as the basis for both RQ1 and RQ2. In RQ1, we use them to identify and validate conspiracy-related semantic regions. In RQ2, we align them across periods and analyze how the semantic neighborhoods associated with each conspiracy theory evolve over time.
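With gensim, the per-period training reduces to the loop below; the corpus dictionary is a toy stand-in for the preprocessed, phrase-collapsed comments of each period.

```python
from gensim.models import Word2Vec

# Toy stand-in: in practice each value holds millions of token lists.
comments_by_period = {
    "2012-2014": [["sandy_hook", "false_flag", "crisis_actor"]] * 10,
    "2015-2019": [["deep_state", "emailgate", "russiagate"]] * 10,
    "2020-2022": [["pandemic", "deep_state", "cabal"]] * 10,
}

# One independent CBOW model per political period (hyperparameters as above).
models = {
    period: Word2Vec(
        sentences=comments,
        vector_size=100,  # embedding dimensionality
        window=5,         # context window size
        min_count=5,      # minimum token frequency
        sg=0,             # 0 selects CBOW
    )
    for period, comments in comments_by_period.items()
}
```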

## 3 RQ1: Is Conspiratorial Language Semantically Distinguishable?

Here, we ask whether the language associated with conspiracy theories occupies a coherent and semantically distinguishable region within online political discourse. Establishing this property is a prerequisite for studying the semantic evolution of our 19 conspiracy theories: without a well-defined semantic structure, individual conspiracy theories cannot be meaningfully represented, compared, or tracked over time.

To answer RQ1, we examine whether conspiracy-related discourse forms an internally coherent and externally separable region in the semantic space of political discourse. RQ1 may yield a negative result: if conspiratorial discourse is semantically indistinguishable from other political discourse, clustering around our concept labels would show low coherence, and human validation would not align with measures of semantic distance. If either condition holds, we cannot identify semantic neighborhoods that represent individual conspiracy theories. Conversely, if we find coherent clusters around our concept labels, validated by human annotation against measures of semantic distance, this indicates that conspiratorial language is semantically distinguishable from other political discourse and that the semantic objects (clusters) around each concept label represent the narratives surrounding each conspiracy theory.

### 3.1 Methods

Our goal is to determine whether each conspiracy theory occupies a localized and coherent region of the semantic space, as opposed to a diffuse or arbitrary collection of terms. Intuitively, if a conspiracy theory has a meaningful presence in discourse, the words and phrases used to discuss it should cluster together in the semantic space. To identify such regions, we treat each conspiracy theory's concept label as an anchor point and then ask: **what is the smallest semantic neighborhood that naturally forms around this label?** Rather than predefining the size of this neighborhood, we allow the structure of the embeddings to determine it.
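To make the neighborhood-growing idea concrete, the sketch below expands a candidate neighborhood around an anchor and stops where mean pairwise cosine similarity drops most sharply. This stopping rule is our illustrative assumption; the excerpt does not give the paper's exact boundary criterion.

```python
import numpy as np
from gensim.models import Word2Vec

def semantic_neighborhood(model: Word2Vec, anchor: str, max_k: int = 50):
    """Grow a neighborhood around `anchor`, stopping at the sharpest
    coherence drop (an assumed criterion, not the paper's)."""
    neighbors = [w for w, _ in model.wv.most_similar(anchor, topn=max_k)]
    coherences = []
    for k in range(2, max_k + 1):
        vecs = model.wv[neighbors[:k]]
        vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
        sims = vecs @ vecs.T
        # mean off-diagonal cosine similarity among the first k neighbors
        coherences.append((sims.sum() - k) / (k * (k - 1)))
    drops = np.diff(coherences)
    best_k = int(np.argmin(drops)) + 2  # the k just before the sharpest drop
    return [anchor] + neighbors[:best_k]
```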

Similar Articles

PolitNuggets: Benchmarking Agentic Discovery of Long-Tail Political Facts

arXiv cs.AI

PolitNuggets is a multilingual benchmark for evaluating large reasoning models within agentic frameworks on their ability to discover and synthesize long-tail political facts by constructing biographies for 400 global elites. The benchmark introduces evaluation protocols like FactNet and reveals that current systems struggle with fine-grained details and efficiency.

Polarization by Default: Auditing Recommendation Bias in LLM-Based Content Curation

arXiv cs.CL

This paper presents a large-scale audit of recommendation biases in LLM-based content curation across OpenAI, Anthropic, and Google using 540,000 simulated selections from Twitter/X, Bluesky, and Reddit data. The study finds that LLMs systematically amplify polarization, exhibit distinct toxicity handling trade-offs, and show significant political leaning bias favoring left-leaning authors despite right-leaning plurality in datasets.

A Community-Based Approach for Stance Distribution and Argument Organization

arXiv cs.CL

Researchers from the University of British Columbia propose an unsupervised graph-based system for organizing arguments from online debates by constructing interaction graphs and applying community detection to reveal diverse viewpoint distributions. The approach requires no training data and aims to help users navigate complex argumentative landscapes and combat filter bubbles.

Consistency Analysis of Sentiment Predictions using Syntactic & Semantic Context Assessment Summarization (SSAS)

arXiv cs.CL

This paper presents SSAS (Syntactic & Semantic Context Assessment Summarization), a framework designed to improve consistency in LLM-based sentiment prediction by reducing noise and variance through hierarchical classification and iterative summarization. Empirical evaluation on three industry-standard datasets shows up to 30% improvement in data quality and reliability for enterprise decision-making.