How Much Does Persuasion Strategy Matter? LLM-Annotated Evidence from Charitable Donation Dialogues
Summary
Researchers use three open-source LLMs to annotate 10,600 persuader turns in the PersuasionForGood corpus with 41 persuasion strategies, finding that strategy categories explain little donation variance and guilt induction significantly lowers donation rates.
View Cached Full Text
Cached at: 04/23/26, 10:02 AM
# How Much Does Persuasion Strategy Matter? LLM-Annotated Evidence from Charitable Donation Dialogues
Source: [https://arxiv.org/html/2604.19783](https://arxiv.org/html/2604.19783)
###### Abstract
Which persuasion strategies, if any, are associated with donation compliance? Answering this requires fine\-grained strategy labels across a full corpus and statistical tests corrected for multiple comparisons\. We annotate all 10,600 persuader turns in the 1,017\-dialogue PersuasionForGood corpus\(Wanget al\.,[2019](https://arxiv.org/html/2604.19783#biba.bib1)\), where donation outcomes are directly observable, with a taxonomy of 41 strategies in 11 categories, using three open\-source large language models \(LLMs; Qwen3:30b, Mistral\-Small\-3\.2, Phi\-4\)\. Strategy categories alone explain little variance in donation outcome \(pseudo
R2≈0\.015R^\{2\}\\approx 0\.015, consistent across all three annotators\)\. Guilt Induction is the only strategy significantly associated with*lower*donation rates \(
Δ≈−23\\Delta\\approx\-23percentage points\), an effect that replicates across all three models despite only moderate inter\-model agreement\. Reciprocity is the most robust positive correlate\. Target sentiment and interest predict whether a donation occurs but show at most a weak correlation with donation amount\. These findings suggest that strategy identification alone is insufficient to explain persuasion effectiveness, and that guilt\-based appeals may be counterproductive in prosocial settings\. We release the fully annotated corpus as a public resource\.
Keywords:persuasion strategies, donation dialogues, LLM annotation, sentiment analysis, PersuasionForGood
\\NAT@set@cites
How Much Does Persuasion Strategy Matter? LLM\-Annotated Evidence from Charitable Donation Dialogues
Tatiana Petrova††thanks:To appear inProceedings of the Workshop on Social Context and Integrating NLP and Psychology to Study Social Interactions \(SoCon\-NLPSI\), co\-located with the 15th Language Resources and Evaluation Conference \(LREC 2026\), Palma de Mallorca, Spain, May 2026\., Stanislav Sokol, Radu StateInterdisciplinary Centre for Security, Reliability and Trust \(SnT\)University of Luxembourg\{tatiana\.petrova, radu\.state\}@uni\.lu, stanislav\.sokol\.001@student\.uni\.luAbstract content
## 1\. Introduction
Charitable donation conversations are a natural setting for studying persuasion: a persuader attempts to convince a target to donate money, and the outcome \(donated or not, and how much\) is directly observable\. The PersuasionForGood corpus\(Wanget al\.,[2019](https://arxiv.org/html/2604.19783#biba.bib1)\)111We use the publicly available PersuasionForGood dataset\(PersuasionForGood\)\.provides 1,017 such dialogues collected via Amazon Mechanical Turk, where persuaders try to convince targets to donate part of their task earnings to Save the Children\. Understanding which strategies help or hinder can inform the design of prosocial dialogue systems and evidence\-based training for charitable fundraisers\.
Subsequent work has improved strategy classification accuracy\(Sahaet al\.,[2021](https://arxiv.org/html/2604.19783#biba.bib5)\)and analyzed target resistance\(Tianet al\.,[2020](https://arxiv.org/html/2604.19783#biba.bib6)\), but the question of which strategies, if any, are statistically associated with donation outcomes remains only partially addressed\.Wanget al\.\([2019](https://arxiv.org/html/2604.19783#biba.bib1)\)tested strategy–donation associations via logistic regression on 252 annotated dialogues with 10 strategies, finding only “Donation information” significant \(p<0\.05p<0\.05\) without multiple\-comparison correction\. The limited sample \(29% of the corpus\), 10\-strategy scheme, and absence of correction for multiple testing leave room for a more comprehensive analysis\. To our knowledge, no prior work has tested individual strategy–outcome associations across the full corpus with a fine\-grained taxonomy and multiple\-comparison corrections\.
We address this gap by \(1\) defining a hierarchical taxonomy of 41 persuasion strategies in 11 categories, grounded in Cialdini’s principles of influence\([1984](https://arxiv.org/html/2604.19783#biba.bib2)\)and Marwell and Schmitt’s compliance\-gaining strategies\([1967](https://arxiv.org/html/2604.19783#biba.bib3)\); \(2\) annotating all 10,600 persuader turns using three open\-source LLMs \(Qwen3:30b as primary annotator, Mistral\-Small\-3\.2 and Phi\-4 as robustness checks\) and all 10,332 target turns using Qwen3:30b alone \(for sentiment and interest labels\), following current best practice\(Carlson and Burbano,[2025](https://arxiv.org/html/2604.19783#biba.bib13); Abdurahmanet al\.,[2025](https://arxiv.org/html/2604.19783#biba.bib14)\); and \(3\) conducting bivariate tests with multiple\-comparison corrections and multivariate logistic regression to assess strategy–donation associations\. Our contributions include corpus\-scale evidence on strategy–donation associations \(including the Guilt Induction backfire and the limited predictive power of strategy categories\), a three\-model robustness design, and a fully annotated resource covering all 1,017 dialogues\.
## 2\. Related Work
#### PersuasionForGood and persuasion in NLP\.
Wanget al\.\([2019](https://arxiv.org/html/2604.19783#biba.bib1)\)introduced the corpus with 10 strategy labels \(e\.g\., logical appeal, emotional appeal, credibility appeal\) and an RCNN\-based classifier\.Sahaet al\.\([2021](https://arxiv.org/html/2604.19783#biba.bib5)\)improved classification with BERT\-based models, whileTianet al\.\([2020](https://arxiv.org/html/2604.19783#biba.bib6)\)analyzed target resistance strategies\.Chen and Yang \([2021](https://arxiv.org/html/2604.19783#biba.bib7)\)proposed weakly supervised identification of 8 persuasion strategies in online contexts\.
#### Persuasion strategy taxonomies\.
Our taxonomy builds on Cialdini’s\([1984](https://arxiv.org/html/2604.19783#biba.bib2)\)six principles of influence and Marwell and Schmitt’s\([1967](https://arxiv.org/html/2604.19783#biba.bib3)\)16 compliance\-gaining strategies, organizing 41 strategies into 11 categories to enable analysis at both category and individual strategy level \(Section 3\.1\)\.
#### LLM\-as\-annotator\.
Recent guidance recommends testing multiple models and treating model choice as a researcher degree of freedom\(Gilardiet al\.,[2023](https://arxiv.org/html/2604.19783#biba.bib4); Pangakiset al\.,[2024](https://arxiv.org/html/2604.19783#biba.bib10); Carlson and Burbano,[2025](https://arxiv.org/html/2604.19783#biba.bib13); Abdurahmanet al\.,[2025](https://arxiv.org/html/2604.19783#biba.bib14)\); we follow this practice \(Section 3\.2\)\.
## 3\. Methodology
### 3\.1\. Taxonomy
Starting from Cialdini’s\([1984](https://arxiv.org/html/2604.19783#biba.bib2)\)principles of influence and Marwell and Schmitt’s\([1967](https://arxiv.org/html/2604.19783#biba.bib3)\)compliance\-gaining strategies, supplemented by work on fear appeals, framing, and emotional manipulation, we compiled 45 candidate strategies\. Pilot annotation revealed that several were poorly distinguishable \(e\.g\., overlapping moral and value\-based appeals\)\. After iterative merging and refinement, we arrived at 41 persuasion strategies in 11 categories, plus 9 conversation management labels \(e\.g\., Greeting, Acknowledgement\)\. Each strategy has a textual definition, characteristic markers, and decision rules \(see released code\)\.
The most frequent persuasion category is Norms / Morality / Values \(n=1,331n=1\{,\}331, 12\.6% of all persuader turns\), followed by Rational / Impact Appeal \(9\.2%\) and Framing & Presentation \(7\.3%\)\. Overtly coercive strategies \(Threat / Pressure, Urgency / Scarcity\) are nearly absent, together accounting for only 0\.2% of turns\. Guilt Induction, while psychologically manipulative, is placed under Norms / Morality / Values because it operates through moral obligation rather than direct coercion\.
Figure 1:Distribution of 11 persuasion strategy categories \(Qwen3:30b\)\. Each bar shows the number of persuader turns assigned to the category; percentages indicate the category’s share of allN=10,600N\{=\}10\{,\}600persuader turns\. Conversation Management turns \(44\.0%\) are omitted\.
### 3\.2\. Annotation Procedure
We annotate all 10,600 persuader utterances across 1,017 dialogues using three open\-source LLMs deployed locally via Ollama:Qwen3:30b\(Alibaba; primary annotator\),Mistral\-Small\-3\.2\(Mistral AI\), andPhi\-4\(Microsoft, 14B\)\. These models were selected on three criteria\. First, they represent distinct developer families \(Alibaba, Mistral AI, Microsoft\), reducing the risk that findings reflect idiosyncratic biases of a single training pipeline, in line with the robustness\-check methodology ofCarlson and Burbano \([2025](https://arxiv.org/html/2604.19783#biba.bib13)\)andAbdurahmanet al\.\([2025](https://arxiv.org/html/2604.19783#biba.bib14)\): one model serves as the primary annotator and the others as independent replications\. Second, all three are fully open\-weight models deployable locally, ensuring that the annotation pipeline is fully reproducible without dependence on proprietary APIs\. Third, by including models of different parameter scales—Phi\-4 \(14B\) and Qwen3 \(30B\) as the smallest and largest – we can assess whether the larger primary annotator’s capacity drives results; Table[4](https://arxiv.org/html/2604.19783#S4.T4)shows that key effects hold across model sizes\. Qwen3:30b is designated as primary because it assigns valid labels to all 10,600 turns without errors, produces non\-degenerate distributions across all 11 categories \(unlike Mistral, which assigns<1%\{<\}1\\%to Framing & Presentation\), and achieves the highest macro\-level agreement with ’s gold standard\.
All three models use the same two\-step hierarchical prompt: the system message instructs the model to act as a “hierarchical persuasion strategy classification system”; the user message presents \(1\) the persuader utterance to classify, \(2\) up to 5 previous dialogue turns as context, and \(3\) the full category→\\tostrategy hierarchy with definitions\. The model first selects the parent category, then selects the specific strategy within that category, returning a structured JSON response\. Temperature is set to 0\.1 for near\-deterministic output\.
We also annotate all 10,332 target \(persuadee\) utterances using Qwen3:30b for two dimensions:*sentiment*\(negative / neutral / positive, coded as−1\-1,0,\+1\+1\) and*interest in donation*\(not interested / neutral / interested, coded as0,11,22\), capturing the target’s expressed engagement with the donation topic independently of affective tone: a turn may be affectively neutral yet indicate genuine curiosity about the charity or a willingness to consider donating\. We treat mean target sentiment and interest as*covariates*rather than primary predictors: both are measured*during*the conversation and thus reflect the target’s evolving response to persuasion rather than pre\-existing dispositions; causal direction between strategies, target responses, and donation cannot be established from observational data alone \(see Limitations\)\.
### 3\.3\. Annotation Quality
We evaluate annotation quality at two levels, following best practices for LLM\-based annotation\(Pangakiset al\.,[2024](https://arxiv.org/html/2604.19783#biba.bib10)\)\. No human inter\-annotator agreement study exists for a 41\-label persuasion strategy task; the expected human ceiling for this taxonomy is unknown\.
#### Cross\-taxonomy validation\.
We compare our annotations against[Wanget al\.](https://arxiv.org/html/2604.19783#biba.bib1)’s\([2019](https://arxiv.org/html/2604.19783#biba.bib1)\)gold standard on 300 dialogues \(3,047 turns\)\. Because the 41\-strategy and 10\-strategy schemes are structurally different, we map both to three macro\-categories \(*persuasive appeal*,*persuasive inquiry*,*non\-strategy*\), obtaining moderate agreement \(Cohen’sκ=0\.507\\kappa=0\.507, macroF1=0\.703F\_\{1\}=0\.703; “moderate” on theLandis and Koch[1977](https://arxiv.org/html/2604.19783#biba.bib9)scale\)\. This is lower than[Wanget al\.](https://arxiv.org/html/2604.19783#biba.bib1)’s humanα\>0\.70\\alpha\>0\.70but was obtained zero\-shot without task\-specific training\.
#### Expert verification\.
An expert in persuasion dialogue reviewed a stratified sample of 100 persuader turns covering all strategy labels annotated by the LLM, confirming correct classification in 84 of 100 cases \(84%\)\. The 16 disagreements predominantly involved boundary cases between semantically adjacent strategies \(e\.g\., Emotional Appeal vs\. Empathy Appeal, Moral Appeal vs\. Self\-feeling Appeal\)\. As this is verification rather than independent blind annotation, we report accuracy rather than Cohen’sκ\\kappa\.
#### Inter\-model agreement\.
On all 10,600 persuader turns, pairwise Cohen’sκ\\kappabetween the three models ranges from 0\.38 to 0\.54 at the strategy level \(“fair” to “moderate”\) and from 0\.44 to 0\.62 at the category level\. At the macro level \(persuasion vs\. conversation management\), agreement is higher:κ=0\.66\\kappa=0\.66–0\.750\.75, with raw agreement of 84–88%\. Three\-way exact match is 34\.1% for strategies and 47\.5% for categories\. The models diverge most on fine\-grained labels \(e\.g\., Rational Appeal vs\. Credibility Appeal\) but converge on functional classification\. The key downstream findings, in particular the Guilt Induction backfire effect, replicate across all three annotators \(Section[4\.7](https://arxiv.org/html/2604.19783#S4.SS7)\)\.
## 4\. Analysis and Results
We conduct analyses at two levels of granularity\. At thecategory level\(Section 4\.1\), we test whether the presence of each of the 11 strategy categories in a dialogue is associated with donation outcome\. At theindividual strategy level\(Sections 4\.2–4\.3\), we restrict tests to strategies appearing in at least 20 dialogues \(n≥20n\\geq 20\), a minimum\-frequency threshold adopted to ensure reliable chi\-square inference; 22 of 41 strategies meet this criterion \(full counts in Appendix[A](https://arxiv.org/html/2604.19783#A1)\)\. In both analyses we apply chi\-square tests with Bonferroni and Benjamini\-Hochberg \(FDR\) corrections for multiple comparisons, followed by multivariate logistic regression to assess independent effects \(Section[4\.5](https://arxiv.org/html/2604.19783#S4.SS5)\)\. Target sentiment and interest are included as covariates rather than primary predictors, as their role is detailed in Section 3\.2; bivariate associations with donation outcome are reported in Section 4\.4\.
The overall donation rate is 53\.6% \(545 / 1,017 dialogues\), with a mean donation of $2\.17 \($4\.05 among donors only\)\. Dialogues contain an average of∼10\{\\sim\}10persuader turns, typically employing 4–5 distinct persuasion strategies per dialogue \(mean=4\.4=4\.4,SD=1\.6SD=1\.6, range 0–10\); persuasion in this setting is thus multi\-strategy by nature\. All results below use Qwen3:30b \(primary annotator\) unless otherwise noted\.
### 4\.1\. Strategy Categories Do Not Independently Predict Donation
We test whether the presence of each strategy category in a dialogue is associated with donation outcome using chi\-square tests with Bonferroni correction for 11 comparisons\. None of the 11 strategy categories reach statistical significance \(allpBonf\>0\.05p\_\{Bonf\}\>0\.05\)\. The closest is Commitment / Consistency \(pBonf=0\.146p\_\{Bonf\}=0\.146\), followed by Rational / Impact Appeal \(pBonf=0\.280p\_\{Bonf\}=0\.280\)\. This null result replicates across all three annotators \(Table[4](https://arxiv.org/html/2604.19783#S4.T4)\)\.
Because dialogues are long \(avg\.∼10\{\\sim\}10turns\) and contain many strategies simultaneously, most categories appear in most dialogues \(Norms / Morality / Values in 72%, Rational / Impact Appeal in 56%\), making binary presence/absence a coarse signal\.
### 4\.2\. Guilt Induction is Associated with Lower Donation
At the individual strategy level, we test 22 strategies withn≥20n\\geq 20dialogues, applying both Bonferroni and Benjamini\-Hochberg \(FDR\) corrections\. Guilt Induction is the only strategy significantly associated with*lower*donation likelihood \(Table[1](https://arxiv.org/html/2604.19783#S4.T1)\)\. Dialogues containing Guilt Induction \(n=104n=104\) have a 32\.7% donation rate, compared to 56\.0% for dialogues without it \(n=913n=913;χ2=19\.4\\chi^\{2\}=19\.4,ϕ=0\.14\\phi=0\.14,pBonf<0\.001p\_\{Bonf\}<0\.001,pFDR<0\.001p\_\{FDR\}<0\.001\)\. The mean donation amount is also 3\.5×\\timeslower \($0\.67 vs\. $2\.34, Mann\-Whitney Up<0\.001p<0\.001\)\. This effect replicates across all three annotators:Δ=−23\.0\\Delta=\-23\.0pp with Mistral\-Small\-3\.2 and−23\.9\-23\.9pp with Phi\-4, both significant atpraw<0\.05p\_\{raw\}<0\.05\(Table[4](https://arxiv.org/html/2604.19783#S4.T4)\)\.
To illustrate, a typical Guilt Induction utterance from a non\-donated dialogue reads:*“Kids are dying from hunger every minute\. Don’t you want to help stop that?”*In contrast, a Reciprocity utterance from a donated dialogue:*“That’s great\! Every bit helps, I will match your donation myself\.”*Guilt induction threatens the target’s autonomy; reciprocity creates a mutual exchange\.
This pattern is consistent with psychological reactance theory\(Brehm,[1966](https://arxiv.org/html/2604.19783#biba.bib8)\): when individuals perceive that their freedom of choice is threatened, they resist rather than comply \(Figure[2](https://arxiv.org/html/2604.19783#S4.F2)a\)\.
This association is correlational and may partly reflect reverse causality \(see Discussion\)\. Our logistic regression \(Section[4\.5](https://arxiv.org/html/2604.19783#S4.SS5)\) shows that Guilt Induction remains a negative predictor \(odds ratio \[OR\]=0\.60=0\.60,p=0\.029p=0\.029\) even when controlling for target sentiment and interest\.
### 4\.3\. Reciprocity and Commitment/Consistency are Positive Predictors
Reciprocity is the strategy most robustly associated with*higher*donation rates: 72\.2% of dialogues containing Reciprocity \(n=72n=72\) result in donation, vs\. 52\.2% without \(n=945n=945;χ2=10\.0\\chi^\{2\}=10\.0,ϕ=0\.10\\phi=0\.10,pBonf=0\.034p\_\{Bonf\}=0\.034,pFDR=0\.016p\_\{FDR\}=0\.016\)\. The positive direction replicates in all three models \(Δ=\+5\.8\\Delta=\+5\.8to\+20\.1\+20\.1pp\), reaching significance in two of three \(Table[4](https://arxiv.org/html/2604.19783#S4.T4)\)\. Commitment and Consistency, while not significant under Bonferroni correction, also reaches significance under FDR \(pFDR=0\.016p\_\{FDR\}=0\.016; 63\.5% vs\. 51\.2%\)\. For both strategies, reverse causality cannot be excluded: persuaders may deploy them after the target has already signaled willingness\.
Pres\.Abs\.𝒑𝑩\\boldsymbol\{p\_\{B\}\}𝒑𝑭𝑫𝑹\\boldsymbol\{p\_\{FDR\}\}Strategy–donation \(22 tests,χ2\\chi^\{2\}\)Guilt32\.7%56\.0%<\.001\{<\}\.001<\.001\{<\}\.001Reciprocity72\.2%52\.2%\.034\.016Commit\./Cons\.63\.5%51\.2%\.049\.016Target response–donation \(MWU\)Don\.No don\.𝒑\\boldsymbol\{p\}Sentiment\.44\.44\.27\.27<\.001\{<\}\.001Interest1\.241\.241\.091\.09<\.001\{<\}\.001Target response–amount \(Pearson\)𝒓\\boldsymbol\{r\}𝒑\\boldsymbol\{p\}Sentiment\.034\.424Interest−\.001\{\-\}\.001\.981Table 1:Bivariate associations with donation outcome \(Qwen3:30b\)\.Top: donation rates \(%\) in dialogues where the strategy is present \(Pres\.\) vs\. absent \(Abs\.\);pBp\_\{B\}: Bonferroni,pFDRp\_\{FDR\}: Benjamini\-Hochberg correction over 22 tests\. Effect sizes are small \(ϕ=0\.14\\phi=0\.14for Guilt,0\.100\.10for Reciprocity\)\.Middle: mean target sentiment \(−1\-1to\+1\+1\) and interest \(0–22\) in donated \(Don\.\) vs\. not\-donated \(No don\.\) dialogues, Mann\-WhitneyUUtest\.Bottom: Pearsonrrbetween target response and donation amount \(donors only\)\. Cross\-model robustness is reported in Table[4](https://arxiv.org/html/2604.19783#S4.T4)\.Figure 2:\(a\) Donation rates in dialogues containing Guilt Induction \(n=104n\{=\}104\) vs\. dialogues without \(n=913n\{=\}913\); difference of−23\.3\-23\.3pp \(χ2=19\.4\\chi^\{2\}\{=\}19\.4,p<0\.001p<0\.001\)\. This effect replicates across all three annotators \(Table[4](https://arxiv.org/html/2604.19783#S4.T4)\)\. \(b\) Donation rate by predominant target sentiment across a dialogue’s target turns \(χ2\(2\)=44\.3\\chi^\{2\}\(2\)\{=\}44\.3,V=0\.21V\{=\}0\.21,p<0\.001p<0\.001\)\.
### 4\.4\. Target Sentiment and Interest Predict Donation but Not Amount
The target’s expressed sentiment and interest are the variables most strongly associated with whether a donation occurs \(bothp<0\.001p<0\.001; Table[1](https://arxiv.org/html/2604.19783#S4.T1), Figure[2](https://arxiv.org/html/2604.19783#S4.F2)b\)\. Donated dialogues have higher mean target sentiment \(0\.44 vs\. 0\.27 on a−1\-1to\+1\+1scale\) and higher mean interest \(1\.24 vs\. 1\.09 on a 0 to 2 scale\)\. At the turn level, donated dialogues contain more positive target turns \(49\.5% vs\. 38\.6%\) and fewer negative turns \(6\.1% vs\. 12\.0%\)\.
However, neither sentiment nor interest shows a statistically significant linear correlation with the donation*amount*among those who did donate \(Pearsonr=0\.034r=0\.034,p=0\.424p=0\.424andr=−0\.001r=\-0\.001,p=0\.981p=0\.981, respectively\)\. Spearman rank correlation detects a small monotonic association for sentiment \(ρ=0\.112\\rho=0\.112,p=0\.009p=0\.009\) but not for interest \(ρ=0\.059\\rho=0\.059,p=0\.172p=0\.172\), suggesting a weak non\-linear link between target sentiment and donation amount that the linear measure misses\. The “Negative” sentiment group contains only 23 dialogues \(3 donations\), so the 13\.0% rate in Figure[2](https://arxiv.org/html/2604.19783#S4.F2)b carries a wide confidence interval and should be interpreted with caution\.
### 4\.5\. Logistic Regression: Multivariate Analysis
The bivariate tests above examine each predictor in isolation\. To assess whether strategy effects survive when controlling for other predictors, we fit logistic regression models with donation \(binary\) as the dependent variable \(Table[2](https://arxiv.org/html/2604.19783#S4.T2)\)\.
#### Model 1: Strategy categories only\.
With the 9 strategy categories present in≥20\\geq 20dialogues as binary predictors, the model is significant overall \(log\-likelihood ratio \[LLR\]p=0\.011p=0\.011\) but explains little variance \(pseudoR2=0\.015R^\{2\}=0\.015\)\. Commitment / Consistency \(OR=1\.41=1\.41,p=0\.014p=0\.014\) and Rational / Impact Appeal \(OR=1\.36=1\.36,p=0\.018p=0\.018\) are positive predictors; no category is a significant negative predictor\.
#### Model 2: Categories \+ sentiment \+ interest\.
Adding mean target sentiment and interest improves fit substantially \(pseudoR2=0\.082R^\{2\}=0\.082; likelihood ratio test vs\. Model 1:χ2=94\.4\\chi^\{2\}=94\.4,p<10−6p<10^\{\-6\}\)\. Sentiment \(OR=4\.24=4\.24\) and interest \(OR=3\.11=3\.11\) yield the largest effects \(bothp<0\.001p<0\.001\)\. Commitment / Consistency loses significance \(p=0\.014→0\.454p=0\.014\\to 0\.454\), suggesting either confounding or mediation through target sentiment\. Call to Action emerges as a negative predictor \(OR=0\.70=0\.70,p=0\.009p=0\.009\)\.
#### Model 3: Parsimonious model \(exploratory\)\.
As an exploratory check, we fit a compact model with only Guilt Induction, Reciprocity, mean sentiment, and mean interest \(the variables with the strongest bivariate signals\)\. This model achieves nearly the same fit \(pseudoR2=0\.080R^\{2\}=0\.080, Akaike information criterion \[AIC\]=1302=1302, area under the ROC curve \[AUC\]=0\.67=0\.67\) as the full model \(AIC=1314=1314\)\. All four predictors are significant: sentiment \(OR=3\.94=3\.94,p<0\.001p<0\.001\), interest \(OR=2\.77=2\.77,p<0\.001p<0\.001\), Reciprocity \(OR=2\.41=2\.41,p=0\.002p=0\.002\), and Guilt Induction \(OR=0\.60=0\.60,p=0\.029p=0\.029\)\. For comparison, a categories\-only model achieves AUC=0\.58=0\.58\.
PredictorOR95% CI𝒑\\boldsymbol\{p\}Model 3 \(parsimonious\): pseudoR2=0\.080R^\{2\}=0\.080, AUC=0\.67=0\.67Sentiment3\.94\[2\.30, 6\.74\]<\.001\{<\}\.001Interest2\.77\[1\.64, 4\.67\]<\.001\{<\}\.001Reciprocity2\.41\[1\.38, 4\.20\]\.002Guilt Ind\.0\.60\[0\.37, 0\.95\]\.029Table 2:Logistic regression predicting donation \(binary\), parsimonious Model 3\. OR: odds ratio \(\>1\>1= higher donation probability\)\. Sentiment: mean target sentiment per dialogue \(−1\-1to\+1\+1\)\. Interest: mean target interest per dialogue \(0–22\)\.
### 4\.6\. Strategy–Response Sentiment Link
To explore why certain strategies are associated with donation, we pair each persuader turn carrying a persuasion strategy label with the immediately following target turn \(n=5,387n=5\{,\}387persuader–target pairs\) and compute the mean target sentiment elicited by each strategy \(Table[3](https://arxiv.org/html/2604.19783#S4.T3)\)\. The corpus\-wide average response sentiment is\+0\.37\+0\.37\. Guilt Induction \(\+0\.02\+0\.02\) elicits responses far below the corpus average, as do Fear Appeal \(\+0\.19\+0\.19\) and Unity \(\+0\.23\+0\.23\)\. In contrast, Commitment and Consistency \(\+0\.54\+0\.54\), Reciprocity \(\+0\.51\+0\.51\), and Foot\-in\-the\-door \(\+0\.51\+0\.51\) elicit the most positive responses\. Reciprocity co\-occurs with positive engagement; the affective response pattern points to a possible mediating role in the strategy–donation link, though formal mediation analysis would be needed to confirm this\.
StrategynMean sent\.% neg\.Bottom 3 \(lowest mean sentiment\)Guilt Induction114\+0\.02\+0\.0227\.2%Fear Appeal69\+0\.19\+0\.1921\.7%Unity43\+0\.23\+0\.2323\.3%Top 3 \(highest mean sentiment\)Commit\. & Cons\.182\+0\.54\+0\.547\.1%Reciprocity67\+0\.51\+0\.517\.5%Foot\-in\-the\-door41\+0\.51\+0\.517\.3%Corpus avg\.5,387\+0\.37\+0\.37—Table 3:Mean target sentiment \(−1\-1to\+1\+1\) in the turn immediately following each persuader strategy \(n=5,387n=5\{,\}387persuader–target pairs\)\. Only persuasion strategies withn≥20n\\geq 20pairs shown; Conversation Management labels excluded\. % neg\.: proportion of negative target responses\.
### 4\.7\. Cross\-Model Robustness
To assess whether our findings depend on the choice of annotator, we replicate the full analysis pipeline with Mistral\-Small\-3\.2 and Phi\-4 annotations \(Table[4](https://arxiv.org/html/2604.19783#S4.T4)\)\. The Guilt Induction backfire effect is the most robust finding: the effect direction and magnitude \(Δ≈−23\\Delta\\approx\-23pp\) are consistent across all three model families, despite the models identifying different numbers of Guilt turns \(Qwen: 104 dialogues, Mistral: 35, Phi\-4: 36\) and achieving only moderate pairwise agreement \(κ=0\.38\\kappa=0\.38–0\.540\.54\)\. To probe whether this consistency reflects a core set of “obvious” guilt turns, we examine the intersection: only 16 dialogues are flagged by all three models \(Jaccard=0\.14=0\.14\), and 63 are flagged by Qwen alone\. The donation rate is low for both subsets \(37\.5% for all\-three\-agree, 30\.2% for Qwen\-only; Fisherp=0\.56p=0\.56for the difference\), indicating that the effect is not confined to extreme cases; even borderline guilt turns identified by a single model are associated with lower donation rates \(any\-guilt union: 29\.9% vs\. no\-guilt: 56\.7%,p<0\.001p<0\.001\)\. The positive Reciprocity association replicates in direction across all three models and reaches significance in two of three\. The null result for categories \(pseudoR2≈0\.015R^\{2\}\\approx 0\.015\) and the dominance of sentiment and interest in the full model \(pseudoR2≈0\.08R^\{2\}\\approx 0\.08\) are stable across all annotators\.
Table 4:Cross\-model robustness\.Δ\\Delta: difference in donation rate \(pp\) between dialogues with vs\. without the strategy\.R2R^\{2\}: McFadden pseudoR2R^\{2\}\. Significance:p∗<\.05\{\}^\{\*\}p\{<\}\.05,p∗∗<\.01\{\}^\{\*\*\}p\{<\}\.01,p∗∗∗<\.001\{\}^\{\*\*\*\}p\{<\}\.001\(uncorrectedprawp\_\{raw\}; multiple\-comparison correction is applied only to the primary Qwen model in Table[1](https://arxiv.org/html/2604.19783#S4.T1)\)\. Repl\.: models where the finding reaches significance \(prawp\_\{raw\}\) or matches direction \(forR2R^\{2\}\)\.
## 5\. Discussion and Conclusion
Our results indicate that persuasion effectiveness cannot be reduced to “strategy X leads to donation\.”
First, strategy categories have limited predictive power \(pseudoR2=0\.011R^\{2\}=0\.011–0\.0160\.016across all three annotators\), challenging the assumption that strategy identification alone captures persuasion effectiveness\. With 4–5 strategies per dialogue, binary presence/absence is inherently coarse; future work should model strategy*sequences*and*combinations*\.
The moderate inter\-model agreement \(κ=0\.38\\kappa=0\.38–0\.540\.54\) is expected for a 41\-label zero\-shot task with many semantically adjacent strategy pairs\. Key findings replicate regardless of this disagreement \(Table[4](https://arxiv.org/html/2604.19783#S4.T4)\), while weaker effects, notably Reciprocity \(significant in two of three models\), should be interpreted with more caution\. FollowingCarlson and Burbano \([2025](https://arxiv.org/html/2604.19783#biba.bib13)\), effects replicating across all three annotators provide stronger evidence than single\-model results\.
Second, the Guilt Induction backfire effect has a practical implication: prosocial dialogue systems may benefit from avoiding guilt\-based appeals\. Temporal analysis supports this: splitting the 104 guilt\-containing dialogues by the position of the first guilt turn, donation rates decline monotonically from early \(45\.9%45\.9\\%,n=37n\{=\}37\) through mid \(34\.3%34\.3\\%,n=35n\{=\}35\) to late guilt \(15\.6%15\.6\\%,n=32n\{=\}32\), compared to the no\-guilt baseline of56\.0%56\.0\\%\(χ2\(3\)=26\.7\\chi^\{2\}\(3\)=26\.7,p<0\.001p<0\.001\)\. After Bonferroni correction, only late guilt differs significantly from the baseline \(padj<0\.001p\_\{adj\}<0\.001\)\. This pattern may reflect reverse causality \(persuaders may resort to guilt after sensing resistance\), but even early guilt underperforms the baseline numerically\.
#### Resource contribution\.
We release the full annotated dataset \(20,932 turns across 1,017 dialogues\): strategy labels from all three annotators for all 10,600 persuader turns and sentiment/interest labels for all 10,332 target turns, together with prompt templates, analysis scripts, and validation code\.222Code and data:[https://github\.com/persuasion\-nlp/persuasion\-strategies](https://github.com/persuasion-nlp/persuasion-strategies)The full taxonomy is in Appendix[A](https://arxiv.org/html/2604.19783#A1)\.
## 6\. Limitations
Our LLM annotations are produced without fine\-tuning; inter\-model agreement on fine\-grained labels is moderate \(κ=0\.38\\kappa=0\.38–0\.540\.54\), and annotation noise may attenuate downstream estimates\. Key findings \(Guilt backfire, low categoryR2R^\{2\}\) are robust to annotator choice, but weaker effects \(e\.g\., Reciprocity\) vary across models\. Each turn receives one label, though turns may contain multiple strategies; forced single\-label annotation may undercount co\-occurring strategies\. Sentiment and interest are measured*during*the conversation, so they function as concurrent mediators rather than exogenous predictors; causal mediation analysis would be needed to disentangle strategy effects from target response effects\. The released data lacks individual worker identifiers, so non\-independence across dialogues cannot be ruled out\. Finally, all models explain at most 8% of variance \(pseudoR2=0\.08R^\{2\}=0\.08\); all reported associations are correlational\. Expert verification was conducted by a single annotator with expertise in persuasion and dialogue research; while the 84% accuracy on a stratified 100\-turn sample suggests acceptable label quality, a multi\-annotator blind evaluation would yield a more reliable estimate and is left for future work\.
## 7\. Ethics Statement
This work analyzes existing publicly available dialogue data\(Wanget al\.,[2019](https://arxiv.org/html/2604.19783#biba.bib1)\)\. No new human subjects data was collected\. The persuasion strategies we study are from cooperative charitable donation contexts\. Findings about persuasion strategy effectiveness could theoretically inform manipulative applications; however, the primary intended use is understanding human persuasion dynamics and improving prosocial dialogue systems\.
## List of Abbreviations
AICAkaike Information CriterionAUCArea Under the ROC CurveCIConfidence IntervalFDRFalse Discovery RateLLMLarge Language ModelLLRLog\-Likelihood RatioMWUMann\-WhitneyUU\(test\)OROdds Ratiopppercentage points
## 8\. Bibliographical References
- S\. Abdurahman, A\. Salkhordeh Ziabari, A\. K\. Moore, D\. M\. Bartels, and M\. Dehghani \(2025\)A primer for evaluating large language models in social\-science research\.Advances in Methods and Practices in Psychological Science8\(2\)\.External Links:[Document](https://dx.doi.org/10.1177/25152459251325174)Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1),[§3\.2](https://arxiv.org/html/2604.19783#S3.SS2.p1.1)\.
- J\. W\. Brehm \(1966\)A theory of psychological reactance\.Academic Press\.Cited by:[§4\.2](https://arxiv.org/html/2604.19783#S4.SS2.p3.1)\.
- N\. A\. Carlson and V\. Burbano \(2025\)The use of LLMs to annotate data in management research: foundational guidelines and warnings\.Strategic Management Journal\.Note:Early viewExternal Links:[Document](https://dx.doi.org/10.1002/smj.70023)Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1),[§3\.2](https://arxiv.org/html/2604.19783#S3.SS2.p1.1),[§5](https://arxiv.org/html/2604.19783#S5.p3.2)\.
- W\. Chen and D\. Yang \(2021\)Weakly supervised persuasion strategy identification in online persuasion\.InProceedings of the AAAI Conference on Artificial Intelligence,Vol\.35,pp\. 12704–12711\.Cited by:[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1)\.
- R\. B\. Cialdini \(1984\)Influence: the psychology of persuasion\.William Morrow\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px2.p1.1),[§3\.1](https://arxiv.org/html/2604.19783#S3.SS1.p1.1)\.
- F\. Gilardi, M\. Alizadeh, and M\. Kubli \(2023\)ChatGPT outperforms crowd workers for text\-annotation tasks\.Proceedings of the National Academy of Sciences120\(30\),pp\. e2305016120\.Cited by:[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1)\.
- J\. R\. Landis and G\. G\. Koch \(1977\)The measurement of observer agreement for categorical data\.Biometrics33\(1\),pp\. 159–174\.Cited by:[§3\.3](https://arxiv.org/html/2604.19783#S3.SS3.SSS0.Px1.p1.3)\.
- G\. Marwell and D\. R\. Schmitt \(1967\)Dimensions of compliance\-gaining behavior: an empirical analysis\.Sociometry30\(4\),pp\. 350–364\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px2.p1.1),[§3\.1](https://arxiv.org/html/2604.19783#S3.SS1.p1.1)\.
- N\. Pangakis, S\. Wolken, and N\. Fasching \(2024\)Automated annotation with generative AI requires validation\.arXiv preprint arXiv:2306\.00176\.Cited by:[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1),[§3\.3](https://arxiv.org/html/2604.19783#S3.SS3.p1.1)\.
- T\. Saha, S\. R\. Jayashree, S\. Saha, and P\. Bhattacharyya \(2021\)Towards modeling the style of persuasive strategies in persuasive dialogues\.InFindings of the Association for Computational Linguistics: ACL\-IJCNLP 2021,pp\. 3873–3883\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p2.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1)\.
- Y\. Tian, W\. Shi, C\. Li, and Z\. Yu \(2020\)Understanding user resistance strategies in persuasive conversations\.InFindings of the Association for Computational Linguistics: EMNLP 2020,pp\. 4799–4808\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p2.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1)\.
- X\. Wang, W\. Shi, R\. Kim, Y\. Oh, S\. Yang, J\. Zhang, and Z\. Yu \(2019\)Persuasion for good: towards a personalized persuasive dialogue system for social good\.InProceedings of the 57th Annual Meeting of the Association for Computational Linguistics,pp\. 5635–5649\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p1.1),[§1](https://arxiv.org/html/2604.19783#S1.p2.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1),[§3\.2](https://arxiv.org/html/2604.19783#S3.SS2.p1.1),[§3\.3](https://arxiv.org/html/2604.19783#S3.SS3.SSS0.Px1.p1.3),[§7](https://arxiv.org/html/2604.19783#S7.p1.1)\.
## 9\. Language Resource References
- S\. Abdurahman, A\. Salkhordeh Ziabari, A\. K\. Moore, D\. M\. Bartels, and M\. Dehghani \(2025\)A primer for evaluating large language models in social\-science research\.Advances in Methods and Practices in Psychological Science8\(2\)\.External Links:[Document](https://dx.doi.org/10.1177/25152459251325174)Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1),[§3\.2](https://arxiv.org/html/2604.19783#S3.SS2.p1.1)\.
- J\. W\. Brehm \(1966\)A theory of psychological reactance\.Academic Press\.Cited by:[§4\.2](https://arxiv.org/html/2604.19783#S4.SS2.p3.1)\.
- N\. A\. Carlson and V\. Burbano \(2025\)The use of LLMs to annotate data in management research: foundational guidelines and warnings\.Strategic Management Journal\.Note:Early viewExternal Links:[Document](https://dx.doi.org/10.1002/smj.70023)Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1),[§3\.2](https://arxiv.org/html/2604.19783#S3.SS2.p1.1),[§5](https://arxiv.org/html/2604.19783#S5.p3.2)\.
- W\. Chen and D\. Yang \(2021\)Weakly supervised persuasion strategy identification in online persuasion\.InProceedings of the AAAI Conference on Artificial Intelligence,Vol\.35,pp\. 12704–12711\.Cited by:[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1)\.
- R\. B\. Cialdini \(1984\)Influence: the psychology of persuasion\.William Morrow\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px2.p1.1),[§3\.1](https://arxiv.org/html/2604.19783#S3.SS1.p1.1)\.
- F\. Gilardi, M\. Alizadeh, and M\. Kubli \(2023\)ChatGPT outperforms crowd workers for text\-annotation tasks\.Proceedings of the National Academy of Sciences120\(30\),pp\. e2305016120\.Cited by:[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1)\.
- J\. R\. Landis and G\. G\. Koch \(1977\)The measurement of observer agreement for categorical data\.Biometrics33\(1\),pp\. 159–174\.Cited by:[§3\.3](https://arxiv.org/html/2604.19783#S3.SS3.SSS0.Px1.p1.3)\.
- G\. Marwell and D\. R\. Schmitt \(1967\)Dimensions of compliance\-gaining behavior: an empirical analysis\.Sociometry30\(4\),pp\. 350–364\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p3.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px2.p1.1),[§3\.1](https://arxiv.org/html/2604.19783#S3.SS1.p1.1)\.
- N\. Pangakis, S\. Wolken, and N\. Fasching \(2024\)Automated annotation with generative AI requires validation\.arXiv preprint arXiv:2306\.00176\.Cited by:[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px3.p1.1),[§3\.3](https://arxiv.org/html/2604.19783#S3.SS3.p1.1)\.
- T\. Saha, S\. R\. Jayashree, S\. Saha, and P\. Bhattacharyya \(2021\)Towards modeling the style of persuasive strategies in persuasive dialogues\.InFindings of the Association for Computational Linguistics: ACL\-IJCNLP 2021,pp\. 3873–3883\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p2.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1)\.
- Y\. Tian, W\. Shi, C\. Li, and Z\. Yu \(2020\)Understanding user resistance strategies in persuasive conversations\.InFindings of the Association for Computational Linguistics: EMNLP 2020,pp\. 4799–4808\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p2.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1)\.
- X\. Wang, W\. Shi, R\. Kim, Y\. Oh, S\. Yang, J\. Zhang, and Z\. Yu \(2019\)Persuasion for good: towards a personalized persuasive dialogue system for social good\.InProceedings of the 57th Annual Meeting of the Association for Computational Linguistics,pp\. 5635–5649\.Cited by:[§1](https://arxiv.org/html/2604.19783#S1.p1.1),[§1](https://arxiv.org/html/2604.19783#S1.p2.1),[§2](https://arxiv.org/html/2604.19783#S2.SS0.SSS0.Px1.p1.1),[§3\.2](https://arxiv.org/html/2604.19783#S3.SS2.p1.1),[§3\.3](https://arxiv.org/html/2604.19783#S3.SS3.SSS0.Px1.p1.3),[§7](https://arxiv.org/html/2604.19783#S7.p1.1)\.
## Appendix AFull Strategy Taxonomy
Table[5](https://arxiv.org/html/2604.19783#A1.T5)lists all 41 persuasion strategies and 9 conversation management labels with turn counts\.
CategoryStrategyn%Norms / Morality / ValuesAppeal to Values5435\.1Moral Appeal5214\.9Guilt Induction1291\.2Self\-feeling Appeal1381\.3Rational / Impact AppealRational Appeal9288\.8Logical Appeal450\.4Framing & PresentationFraming7517\.1Loss Aversion Appeal180\.2Bait\-and\-switch1<<0\.1Pretexting1<<0\.1Authority / ExpertiseCredibility Appeal6836\.4Expertise70\.1Authority70\.1Emotional InfluenceEmpathy Appeal2652\.5Storytelling1801\.7Emotional Appeal760\.7Fear Appeal730\.7Sympathy Appeal400\.4Emotional Manipulation3<<0\.1Call to ActionCall to Action6356\.0Liking110\.1Commitment / ConsistencyCommitment and Consistency2322\.2Activ\. of Personal Commitment950\.9Foot\-in\-the\-door450\.4Door\-in\-the\-face170\.2Social InfluenceSocial Proof1911\.8Unity530\.5Social Positioning3<<0\.1Exchange / IncentivesReciprocity880\.8Rewarding Activity710\.7Pre\-giving570\.5Debt6<<0\.1Urgency / ScarcityUrgency170\.2Scarcity2<<0\.1Threat / PressureThreat5<<0\.1Aversive Stimulation2<<0\.1Conversation ManagementGreeting / Rapport1,06010\.0Acknowledgement1,0229\.6Charity Awareness Probe7927\.5Non\-persuasive Other6085\.7Logistics / Coordination3503\.3Donation Baseline / Habit Probe3253\.1Conversation Closing2972\.8Qualification / Segmentation1381\.3Permission / Time Check690\.7Table 5:Complete taxonomy: 41 persuasion strategies in 11 categories plus 9 conversation management labels \(Qwen3:30b annotations\)\. Columns show absolute turn counts and each label’s percentage of allN=10,600N\{=\}10\{,\}600persuader turns \(percentages sum to 100%\)\. Five of the 41 taxonomy strategies \(Punishing Activity, Overloading, Confusion Induction, Promise, and Activation of Impersonal Commitment\) received zero assignments and are not shown\.Similar Articles
LLM Attribution Analysis Across Different Fine-Tuning Strategies and Model Scales for Automated Code Compliance
This paper analyzes how different fine-tuning strategies (FFT, LoRA, quantized LoRA) and model scales affect LLM interpretive behavior for automated code compliance tasks using perturbation-based attribution analysis. The findings show FFT produces more focused attribution patterns than parameter-efficient methods, and larger models develop specific interpretive strategies with diminishing performance returns beyond 7B parameters.
No Universal Courtesy: A Cross-Linguistic, Multi-Model Study of Politeness Effects on LLMs Using the PLUM Corpus
This paper investigates how politeness and impoliteness in user prompts affect LLM responses across three languages and five major models, finding that politeness effects are language- and model-dependent rather than universal. The authors release PLUM, a multilingual corpus of 1,500 human-validated prompts with politeness annotations, and assess response quality using eight factors.
Evaluating LLMs as Human Surrogates in Controlled Experiments
This paper evaluates whether off-the-shelf LLMs can reliably simulate human responses in controlled behavioral experiments by comparing LLM-generated data with human survey responses on accuracy perception. The findings show that while LLMs capture directional effects and aggregate belief-updating patterns, they do not consistently match human-scale effect magnitudes, clarifying when synthetic LLM data can serve as behavioral proxies.
Polarization by Default: Auditing Recommendation Bias in LLM-Based Content Curation
This paper presents a large-scale audit of recommendation biases in LLM-based content curation across OpenAI, Anthropic, and Google using 540,000 simulated selections from Twitter/X, Bluesky, and Reddit data. The study finds that LLMs systematically amplify polarization, exhibit distinct toxicity handling trade-offs, and show significant political leaning bias favoring left-leaning authors despite right-leaning plurality in datasets.
Persona-Assigned Large Language Models Exhibit Human-Like Motivated Reasoning
This paper investigates whether assigning personas to large language models induces human-like motivated reasoning, finding that persona-assigned LLMs show up to 9% reduced veracity discernment and are up to 90% more likely to evaluate scientific evidence in ways congruent with their induced political identity, with prompt-based debiasing largely ineffective.