Ideology Prediction of German Political Texts
Summary
This paper presents a transformer-based model that projects the political orientation of German texts onto a continuous left-to-right spectrum. It reports high accuracy across multiple corpora, including Bundestag plenary notes, Wahl-O-Mat, newspapers, and tweets.
Source: https://huggingface.co/papers/2605.14352
Abstract
A transformer-based model projects political orientation on a continuous spectrum using multiple corpora, achieving high accuracy in detecting political bias across different text sources.
Elections represent a crucial milestone in a nation’s ongoing development. To better understand the political rhetoric from various movements, ranging from left to right, we propose a transformer-based model capable of projecting the political orientation of a text on a continuous left-to-right spectrum, represented by a normalized scalar d between -1 and 1. This approach enables analysts to focus on specific segments of the political landscape, such as conservatives, while excluding liberal and far-right movements. With multiclass classifiers, such a task can be achieved only if the desired orientation is one of their predefined classes. To determine the most suitable foundation model among 13 candidate transformers for this task, we constructed four distinct corpora. One corpus comprised annotated plenary notes from the German Bundestag, while another was based on an official online decision-making tool, Wahl-O-Mat. The third corpus consisted of articles from 33 newspapers, each identified by its political orientation, and the fourth included 535,200 tweets from 597 members of the 20th and 21st German Bundestag. To mitigate overfitting, we used two of the corpora for training and the other two for testing. For in-domain performance, DeBERTa-large achieved the highest F1 score (F1 = 0.844), and it also performed best on the X (Twitter) out-of-domain test (ACC = 0.864). On the newspaper out-of-domain test, Gemma2-2B excelled (MAE = 0.172). This study demonstrates that transformer models can recognize political framing in German news at the level of public opinion polls. Our findings suggest that both the model architecture and the availability of domain-specific training data can be as influential as model size for estimating political bias. We discuss methodological limitations and outline directions for improving the robustness of bias measurement.
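The practical use of a continuous score can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that some model has already assigned each text a score d in [-1, 1], and the names `ScoredText`, `select_segment`, and `mean_absolute_error`, the band boundaries, and the sample scores are all hypothetical. It shows how an analyst could pick an arbitrary band of the spectrum, and how a mean absolute error (MAE) such as the one reported for the newspaper test is computed from predicted and reference scores.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ScoredText:
    """A text with an orientation score d in [-1, 1] (-1 = left, 1 = right)."""
    text: str
    d: float


def select_segment(items: Sequence[ScoredText], low: float, high: float) -> List[ScoredText]:
    """Return the texts whose score lies in the closed band [low, high]."""
    if not -1.0 <= low <= high <= 1.0:
        raise ValueError("band must satisfy -1 <= low <= high <= 1")
    return [item for item in items if low <= item.d <= high]


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


if __name__ == "__main__":
    # Illustrative scores only; they are not taken from the paper.
    corpus = [
        ScoredText("text A", -0.8),
        ScoredText("text B", -0.1),
        ScoredText("text C", 0.4),
        ScoredText("text D", 0.9),
    ]
    # Hypothetical band for one segment of the spectrum.
    segment = select_segment(corpus, 0.2, 0.6)
    print([item.text for item in segment])

    # MAE between hypothetical predictions and reference labels.
    print(mean_absolute_error([-0.5, 0.5], [-0.25, 0.75]))
```

Because the band is just a pair of numbers, it can be moved or narrowed without retraining, whereas a multiclass classifier is limited to the classes it was trained with.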
Similar Articles
Political Plasticity: An Analysis of Ideological Adaptability in Large Language Models
This research paper analyzes 'political plasticity' in Large Language Models, finding that newer models exhibit reliable ideological adaptability when prompted with user examples, whereas older models show limited or unstable responses.
Which AI is closest to your political views? I tested 100+ LLMs on the same 117 questions
An independent analysis tested 100+ LLMs on 117 political questions to map their ideological alignment, revealing that DeepSeek and Grok lean left while most other models cluster near the center or right.
TextLDM: Language Modeling with Continuous Latent Diffusion
This paper introduces TextLDM, a method that adapts visual latent diffusion transformers for language modeling by mapping discrete tokens to continuous latents. It demonstrates that this approach, enhanced by representation alignment, matches GPT-2 performance and unifies visual and text generation architectures.
Polarization by Default: Auditing Recommendation Bias in LLM-Based Content Curation
This paper presents a large-scale audit of recommendation biases in LLM-based content curation across OpenAI, Anthropic, and Google using 540,000 simulated selections from Twitter/X, Bluesky, and Reddit data. The study finds that LLMs systematically amplify polarization, exhibit distinct toxicity handling trade-offs, and show significant political leaning bias favoring left-leaning authors despite right-leaning plurality in datasets.