A comparative study of transformer-based embeddings for topic coherence
Summary
This paper systematically examines the effect of model size on topic quality, running seven transformer-based language models through a BERTopic pipeline. It finds that model size, from 22 million to 13 billion parameters, has a negligible effect on topic coherence, suggesting that smaller models can perform comparably to larger ones.
Source: [https://arxiv.org/abs/2605.28832](https://arxiv.org/abs/2605.28832)
[View PDF](https://arxiv.org/pdf/2605.28832)
> Abstract: Topic modeling is a branch of Natural Language Processing (NLP) that aims to organize large collections of texts into coherent groups according to word co-occurrence patterns, with Latent Dirichlet Allocation (LDA) remaining one of the most widely used and interpretable probabilistic approaches. Recent advances in NLP, particularly transformer-based language models, offer improved document representations. It is also known that the size of a model (in number of parameters) has a significant impact on the performance of language models on various predefined tasks. In this study, we systematically examine the effect of model size on topic quality by analyzing the performance of seven transformer-based language models (from small models such as MiniLM to large ones such as LLaMA-2) in a BERTopic pipeline on a variety of corpora. Topic quality is evaluated using coherence and divergence metrics following Röder et al. (2015). Our results indicate that model size, ranging from 22 million to 13 billion parameters, has a negligible impact on topic quality, suggesting that smaller models can achieve performance comparable to that of larger models.
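To make the notion of topic coherence concrete, here is a minimal sketch of one coherence score, the average normalized pointwise mutual information (NPMI) over the word pairs of a topic. The abstract does not say which of the measures surveyed by Röder et al. (2015) the authors use, so the choice of NPMI, the document-level co-occurrence counting, the function name `npmi_coherence`, and the toy corpus are all assumptions made for illustration, not the paper's method.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs of one topic.

    Assumption: probabilities are estimated from document-level
    co-occurrence (a word pair co-occurs if both words appear in the
    same document). This is a simplification chosen for illustration.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            # The pair never co-occurs: use NPMI's lower bound.
            scores.append(-1.0)
        elif p12 == 1:
            # Both words occur in every document, so NPMI is 0/0;
            # scoring it as 1.0 is a convention assumed here.
            scores.append(1.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["cat", "dog", "pet"],
    ["cat", "dog", "vet"],
    ["stock", "market", "trade"],
    ["stock", "market", "bank"],
]

coherent = npmi_coherence(["cat", "dog"], docs)           # words always appear together
incoherent = npmi_coherence(["cat", "stock"], docs)       # words never appear together
mixed = npmi_coherence(["cat", "dog", "stock"], docs)     # one good pair, two bad pairs
```

Scores lie in [-1, 1], with higher values indicating that a topic's top words tend to appear in the same documents. In a comparison like the one the abstract describes, a score of this kind would be computed for the topics produced under each embedding model and then compared across models.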
## Submission history
From: Willy Rodriguez \[via CCSD proxy\]
**\[v1\]** Fri, 10 Apr 2026 08:34:47 UTC (2,342 KB)

## Similar Articles
A Comparative Evaluation of Structural Topic Models and BERTopic for Short, Open-Ended Survey Responses
This paper compares Structural Topic Models (STM) and BERTopic for analyzing short, open-ended survey responses, finding that BERTopic with contextual augmentation yields better topic coherence and interpretability, while STM offers stronger support for inferential covariate analysis.
Transformer Scalability Crisis: The First Comprehensive Empirical Analysis of Performance Walls in Modern Language Models
This paper presents the first large-scale empirical analysis of 118 transformer models, revealing critical performance walls where success rates drop from 88.1% at 512 tokens to 0% at 2048 tokens, challenging prevailing scaling assumptions.
From Correlation to Cause: A Five-Stage Methodology for Feature Analysis in Transformer Language Models
This paper proposes a five-stage methodology for causal feature analysis in transformer language models, demonstrated on GPT-2 small for the IOI task. It finds that features are specifically causal but not necessary, and exposes a gap between detection and causal robustness.
Scaling laws for neural language models
Foundational empirical study demonstrating power-law scaling relationships between language model performance and model size, dataset size, and compute budget, with implications for optimal training allocation and sample efficiency.
Response-free item difficulty modelling for multiple-choice items with fine-tuned transformers: Component-wise representation and multi-task learning
The paper proposes fine-tuning transformer encoders end-to-end for response-free item difficulty modelling of multiple-choice reading comprehension items, with component-wise and multi-task variants, showing that multi-task learning improves in small-sample regimes.