Psychological Constructs in Shared Semantic Space
Summary
This paper proposes a framework using Supervised Semantic Differential to represent psychological constructs as directions in a shared word-embedding space, enabling comparison across different measurement instruments and research traditions.
# Psychological Constructs in Shared Semantic Space
Source: [https://arxiv.org/html/2605.26801](https://arxiv.org/html/2605.26801)
###### Abstract
Psychological constructs are often measured in separate instruments, datasets, and research traditions, which makes direct comparison difficult\. This paper proposes a framework for making such constructs semantically commensurate by representing and comparing them as directions in a shared word\-embedding space\. Using Supervised Semantic Differential, we estimate construct\-specific semantic gradients from text–outcome associations and project them onto theoretically motivated reference axes\. As an initial test case, we use Valence, Arousal, and Dominance \(VAD\) as an affective coordinate system\. First, we recover interpretable VAD directions from English word\-level affective norms\. Second, we project semantic gradients for 27 GoEmotions categories into this space and recover the expected organization of emotions, especially along valence and arousal\. Third, we apply the same procedure to Big Five personality domains and facets derived from IPIP\-NEO\-300 item–factor associations\. Domain\-level placements are broadly coherent, while facet\-level results are more exploratory because they rely on sparse questionnaire text\. The results suggest that embedding spaces can support construct\-level comparison across otherwise incommensurable psychological measurements, provided that semantic placements are assessed for stability and interpretability\.
Hubert Plisiecki, IDEAS Research Institute, hplisiecki@gmail.com
## 1 Introduction
Code repository: [https://github.com/hplisiecki/construct_space](https://github.com/hplisiecki/construct_space)

Psychological science studies many phenomena that are intuitively related but empirically difficult to compare. Emotions, personality traits, attitudes, symptoms, values, and affective states are usually measured with separate instruments, developed in different research traditions, and scored on incommensurable scales. As a result, constructs that may be conceptually adjacent are often analyzed only within the boundaries of a single questionnaire, dataset, or theoretical framework. This limits our ability to ask broader questions about how psychological phenomena relate to one another: for example, whether a personality facet is semantically closer to anxiety, dominance, curiosity, or positive affect; or how constructs from different studies occupy a common space of human meaning.
Distributional semantics offers a way to address this problem. Word embeddings provide a shared representational space in which words, texts, and semantic directions can be compared geometrically (Mikolov et al., [2013](https://arxiv.org/html/2605.26801#bib.bib5); Kozlowski et al., [2019](https://arxiv.org/html/2605.26801#bib.bib10)). Rather than treating embeddings only as predictive features, we use them here as a common semantic substrate for psychological measurement. The central idea is that constructs from different studies can be represented as directions in the same embedding space, making them jointly interpretable even when their original measurements come from different scales, corpora, or research traditions.
We operationalize this idea using the Supervised Semantic Differential (SSD), a recent method inspired by psycholinguistic work on connotative meaning (Osgood et al., [1957](https://arxiv.org/html/2605.26801#bib.bib8)) that estimates supervised directions in embedding space from paired text representations and continuous outcome values (Plisiecki et al., [2025](https://arxiv.org/html/2605.26801#bib.bib25)). Given document embeddings and an outcome variable, SSD fits a supervised linear model, back-projects the fitted coefficient vector into the original embedding space, and interprets the resulting direction through its nearest-neighbor structure. In the present paper, we extend SSD from single-construct interpretation to cross-construct comparison by estimating multiple construct-specific directions in the same embedding space and projecting them onto shared reference axes.
As a first demonstration, we use affective meaning as the reference frame. Valence, Arousal, and Dominance (VAD) are among the best-validated dimensions of psychological meaning, with strong grounding in human ratings, affect theory, and distributional semantics (Warriner et al., [2013](https://arxiv.org/html/2605.26801#bib.bib3); Russell, [1980](https://arxiv.org/html/2605.26801#bib.bib2); Russell and Mehrabian, [1977](https://arxiv.org/html/2605.26801#bib.bib1)). We therefore estimate VAD directions in a fixed GloVe embedding space and use them as interpretable axes onto which other construct gradients can be projected. Importantly, the contribution is not limited to affect: VAD serves as a theoretically well-understood test case for the broader claim that embedding spaces can support shared semantic representations of psychological phenomena.
We demonstrate the approach across three studies. Study 1 estimates the three affective reference directions from Warriner et al.'s English word-level VAD norms (Warriner et al., [2013](https://arxiv.org/html/2605.26801#bib.bib3)), yielding unit vectors $\hat{\boldsymbol{\beta}}_V$, $\hat{\boldsymbol{\beta}}_A$, and $\hat{\boldsymbol{\beta}}_D$ in GloVe space. Study 2 validates these reference directions by mapping 27 discrete emotion categories from the GoEmotions corpus (Demszky et al., [2020](https://arxiv.org/html/2605.26801#bib.bib30)) into the resulting VAD space and comparing the recovered organization with predictions from the circumplex model of affect (Russell, [1980](https://arxiv.org/html/2605.26801#bib.bib2)). Study 3 applies the same framework to personality measurement by deriving semantic gradients from item-level responses on the IPIP-NEO-300 Big Five inventory (Goldberg, [1999](https://arxiv.org/html/2605.26801#bib.bib29)) and projecting all five domains and 30 facets into the affective reference space.
## 2 Related Work
### 2.1 Distributional Semantics as Shared Representational Geometry
Word embeddings represent lexical meaning as points in a continuous vector space, making semantic relations accessible through distances, neighborhoods, and directions (Mikolov et al., [2013](https://arxiv.org/html/2605.26801#bib.bib5); Pennington et al., [2014](https://arxiv.org/html/2605.26801#bib.bib7)). Although developed primarily as representations for NLP systems, these spaces have also been used as analytic objects. Prior work has used embedding geometry to study semantic change (Hamilton et al., [2016](https://arxiv.org/html/2605.26801#bib.bib23)), historical shifts in gender and ethnic stereotypes (Garg et al., [2018](https://arxiv.org/html/2605.26801#bib.bib6)), and the relational structure of cultural concepts such as social class (Kozlowski et al., [2019](https://arxiv.org/html/2605.26801#bib.bib10)). Related projection-based work has shown that linear directions in embedding space can recover human judgments of object features such as size, danger, and wetness (Grand et al., [2022](https://arxiv.org/html/2605.26801#bib.bib4)).
These results suggest that embedding spaces can encode interpretable dimensions of social and psychological meaning\. However, most existing work studies a single lexical contrast, cultural dimension, or historical trajectory\. The unit of analysis is usually a word, dictionary, or predefined contrast\. In this paper, the unit of analysis is a psychological construct\. We ask whether constructs measured in different datasets can be represented as directions in a common embedding space and meaningfully compared within that shared geometry\.
This use case places different requirements on representations than standard prediction or retrieval. For construct-level measurement, the relevant properties are not only task accuracy, but also whether the space supports stable linear directions, projection onto interpretable axes, and qualitative audit via nearest neighbors. This corresponds to the broader distinction between prediction-oriented and measurement-oriented meaning representations (Plisiecki, [2026](https://arxiv.org/html/2605.26801#bib.bib24)). Static embeddings are useful here because they provide a fixed lexical coordinate system in which directions, projections, and local neighborhoods are directly inspectable. Contextual models may encode richer information, but their layer dependence and entanglement of semantic, syntactic, and surface-form signals make them less straightforward as substrates for the present linear measurement workflow.
### 2.2 Affective Meaning and the Semantic Differential
The affective reference axes used in this paper are motivated by work on connotative meaning and dimensional theories of affect. The Semantic Differential represents word meaning through ratings on bipolar scales and showed that much of this variation can be summarized by a small set of dimensions, classically evaluation, potency, and activity (Osgood et al., [1957](https://arxiv.org/html/2605.26801#bib.bib8)). These dimensions map onto how affect theories organize emotion and connotative meaning in terms of valence, arousal, and dominance (Russell and Mehrabian, [1977](https://arxiv.org/html/2605.26801#bib.bib1); Russell, [1980](https://arxiv.org/html/2605.26801#bib.bib2)). Valence captures the pleasant–unpleasant dimension of meaning; arousal captures activation level, ranging from calm to excited; and dominance captures perceived power or control, ranging from submissive to dominant. Large-scale affective norms provide human ratings of these dimensions for thousands of English lemmas (Warriner et al., [2013](https://arxiv.org/html/2605.26801#bib.bib3)).
We use Valence, Arousal, and Dominance (VAD) as reference axes because they are well validated, low-dimensional, and theoretically interpretable. The claim is not that VAD exhausts psychological meaning. Rather, VAD provides a controlled test case for the broader framework: if constructs can be represented as directions in a shared embedding space, then they can be projected onto any theoretically motivated reference axes, as long as those axes leave a substantial imprint in the embedding space. VAD is therefore used as one interpretable coordinate system for comparing construct gradients, not as the only possible grounding space.
### 2.3 Psychological Constructs and Cross-Instrument Comparability
Psychological constructs are typically operationalized within specific instruments, scoring rules, and theoretical traditions. Personality research, for example, represents individual differences through hierarchical trait models, including broad Five-Factor Model domains and narrower facets (McCrae and John, [1992](https://arxiv.org/html/2605.26801#bib.bib27); John et al., [2008](https://arxiv.org/html/2605.26801#bib.bib28)). Public-domain IPIP instruments provide item-level measures of these domains and facets (Goldberg, [1999](https://arxiv.org/html/2605.26801#bib.bib29); Goldberg et al., [2006](https://arxiv.org/html/2605.26801#bib.bib26)). Such instruments support within-scale measurement, but they do not provide a common representation in which constructs from different instruments or datasets can be directly compared.
Recent work has used text representations to address related problems of construct comparability, including taxonomic incommensurability in psychological measurement (Wulff and Mata, [2025](https://arxiv.org/html/2605.26801#bib.bib9)). Our approach differs in focus. Rather than comparing questionnaires through aggregate item similarity alone, we estimate construct-specific semantic gradients from item–outcome associations and place those gradients in a shared embedding space. This makes the semantic representation depend not only on item wording, but also on the empirical relationship between item text and the measured construct.
### 2.4 Supervised Semantic Differential
SSD estimates a semantic gradient from texts paired with a continuous outcome variable and interprets the resulting direction through its nearest-neighbor structure (Plisiecki et al., [2025](https://arxiv.org/html/2605.26801#bib.bib25)). In its standard form, SSD is used to characterize how meaning varies with one outcome inside one dataset. The present paper extends this logic to cross-construct comparison. We estimate multiple construct gradients in the same embedding space, project them onto shared reference axes, and compare their resulting coordinates. This turns SSD from a single-construct interpretive method into a framework for representing and analyzing psychological constructs semantically.
## 3 Method: Supervised Semantic Differential
SSD assumes a collection of documents $d_i$ paired with continuous outcomes $y_i$. Each document is mapped to a dense vector $\mathbf{x}_i \in \mathbb{R}^D$ via SIF-weighted word embeddings with removal of the top principal component to reduce anisotropy (Mu et al., [2017](https://arxiv.org/html/2605.26801#bib.bib11)). The vectors are compressed with PCA to $\tilde{\mathbf{x}}_i \in \mathbb{R}^K$ and a linear model is estimated:

$$y_i = \alpha + \boldsymbol{\beta}^{\top} \tilde{\mathbf{x}}_i + \epsilon_i.$$

The coefficient vector is normalized to unit length to obtain the semantic gradient $\hat{\boldsymbol{\beta}}$, back-projected to $\mathbb{R}^D$. The number of PCA components $K$ is selected by a joint interpretability–stability sweep (AUCK; Plisiecki et al., [2025](https://arxiv.org/html/2605.26801#bib.bib25)).
All analyses use GloVe 42B Common Crawl 300-dimensional embeddings (Pennington et al., [2014](https://arxiv.org/html/2605.26801#bib.bib7)), L2-normalized with one component of anisotropy removed (Mu et al., [2017](https://arxiv.org/html/2605.26801#bib.bib11)). Text preprocessing uses spaCy `en_core_web_lg` (Montani et al., [2023](https://arxiv.org/html/2605.26801#bib.bib12)) with $a = 10^{-3}$ for SIF weighting.
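The pipeline described in this section can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the released implementation: the function names, the `word_prob` unigram-probability dictionary, and the synthetic data are hypothetical, the AUCK sweep for choosing $K$ is not reproduced, and the PCA scores are left unstandardized. Because the PCA components are orthonormal, normalizing before or after back-projection yields the same unit direction here.

```python
import numpy as np

def sif_document_vector(tokens, embeddings, word_prob, a=1e-3):
    """SIF-weighted mean of word vectors; each word gets weight a / (a + p(w))."""
    vecs, weights = [], []
    for tok in tokens:
        if tok in embeddings:                      # skip out-of-vocabulary tokens
            vecs.append(embeddings[tok])
            weights.append(a / (a + word_prob.get(tok, 0.0)))
    if not vecs:
        return None                                # no usable tokens
    return np.average(np.vstack(vecs), axis=0, weights=weights)

def remove_top_component(X):
    """Center the rows of X and project out the first principal direction."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    top = vt[0]
    return Xc - np.outer(Xc @ top, top)

def ssd_gradient(X, y, k):
    """Regress y on the first k PCA scores of X; return a unit direction in R^D."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:k]                            # (k, D), orthonormal rows
    scores = Xc @ components.T                     # (n, k) compressed documents
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    beta_d = components.T @ coef[1:]               # back-project, dropping the intercept
    return beta_d / np.linalg.norm(beta_d)

# Synthetic check: the outcome depends only on the first embedding dimension.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
planted = np.zeros(20)
planted[0] = 1.0
y = X @ planted + 0.1 * rng.normal(size=500)
beta_hat = ssd_gradient(X, y, k=20)
print(round(float(beta_hat @ planted), 3))         # cosine with the planted direction
```

On this synthetic data the recovered gradient should align closely with the planted direction. With real document vectors, `remove_top_component` would be applied before `ssd_gradient`, and $K$ would be chosen by the sweep rather than fixed by hand.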
## 4 Study 1: Affective Gradients
### 4.1 Data
We use the Warriner et al. ([2013](https://arxiv.org/html/2605.26801#bib.bib3)) affective norms, which provide Valence, Arousal, and Dominance ratings for 13,915 English words on 1–9 scales. Each word is treated as a single-token document; with `use_full_doc=True` the document vector equals the word embedding directly. Words not present in GloVe are excluded. The AUCK sweep searches $K \in \{2, 4, \ldots, 120\}$; Study 2 uses the same range.
### 4.2 Results
Table [1](https://arxiv.org/html/2605.26801#S4.T1) reports regression statistics at the sweep-selected $K$ for each dimension. All three axes yield highly significant fits ($p < 10^{-10}$), confirming that the three affective dimensions have clear, recoverable geometric structure in GloVe space. Valence achieves the strongest fit ($r = .73$), Dominance is intermediate ($r = .67$), and Arousal is the weakest ($r = .58$).
Table 1: SSD regression results for VAD axis calibration (Warriner et al. norms).
Appendix [A](https://arxiv.org/html/2605.26801#A1) (Table [4](https://arxiv.org/html/2605.26801#A1.T4)) reports the full cluster structure at each pole of the three calibrated axes. The valence axis separates two positive clusters, aesthetic excellence (*stunning, wonderful, exquisite*) and celebration/inspiration (*inspired, celebrate, creative*), from two negative clusters of crime/threat (*accusations, criminal, violent*) and moral condemnation (*disgusting, heinous, vile*). The arousal axis yields a more fragmented positive side, with clusters ranging from frenzied intensity (*rage, frenzy, screaming*) through violent horror (*horrific, terrifying, brutal*) to criminal violence (*murder, assault, kidnapping*), in contrast to two compact negative clusters of domestic objects (*shelf, drawer, container*) and rural stillness (*cottage, pastoral, meadow*). The dominance positive pole combines superlative quality (*wonderful, fantastic, exceptional*) with excellence and commitment vocabulary (*dedication, commitment, professionalism*); its negative pole encompasses debilitating harm (*severe, crippling, exacerbated*), disease and epidemic (*epidemic, plague, cholera*), horrific suffering (*horrid, dreadful, ghastly*), and mental crisis (*psychotic, paranoia, suicidal*). Together, the strength of these fits ($r = .58$ to $.73$) and the face-valid gradient clusters indicate that the three affective dimensions are well represented by their corresponding gradients.
## 5 Study 2: Mapping Discrete Emotions
### 5.1 Data and Method
The GoEmotions dataset (Demszky et al., [2020](https://arxiv.org/html/2605.26801#bib.bib30)) provides 58,009 Reddit comments, each annotated by multiple raters for 27 emotion categories. For each emotion, the mean rater-agreement score across annotators ($y_i \in [0, 1]$) is used as the SSD outcome, with one model fitted per emotion independently using the same PCA sweep settings as Study 1. Each emotion thus yields a unit gradient vector $\hat{\boldsymbol{\beta}}_e$ in the same GloVe space as the VAD reference vectors.
Before projection, each $\hat{\boldsymbol{\beta}}_e$ is orthogonalized against the normalized mean emotion vector $\hat{\bar{\boldsymbol{\beta}}}$, removing a shared “generic emotionality” direction, and renormalized to $\boldsymbol{\beta}_e^{\perp}$. Affective coordinates are then:

$$(v_e,\, a_e,\, d_e) = \bigl(\boldsymbol{\beta}_e^{\perp} \cdot \hat{\boldsymbol{\beta}}_V,\; \boldsymbol{\beta}_e^{\perp} \cdot \hat{\boldsymbol{\beta}}_A,\; \boldsymbol{\beta}_e^{\perp} \cdot \hat{\boldsymbol{\beta}}_D\bigr).$$
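The orthogonalization and projection steps can be illustrated as follows. This is a sketch under assumptions: the helper names are hypothetical, the gradients and reference axes are random stand-ins rather than fitted SSD directions, and the mean emotion vector is taken to be the normalized arithmetic mean of the unit gradients.

```python
import numpy as np

def unit_rows(M):
    """Scale every row of M to unit length."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def orthogonalize_against_mean(grads):
    """Remove the shared mean direction from each gradient, then renormalize."""
    mean_dir = grads.mean(axis=0)
    mean_dir = mean_dir / np.linalg.norm(mean_dir)
    residual = grads - np.outer(grads @ mean_dir, mean_dir)
    return unit_rows(residual), mean_dir

def affective_coordinates(grads_perp, beta_v, beta_a, beta_d):
    """Dot products with the three unit reference axes; one (v, a, d) row per construct."""
    axes = unit_rows(np.vstack([beta_v, beta_a, beta_d]))
    return grads_perp @ axes.T

# Random stand-ins for 27 emotion gradients and three reference axes in 300 dimensions.
rng = np.random.default_rng(1)
emotion_grads = unit_rows(rng.normal(size=(27, 300)) + 0.5)   # +0.5 adds a shared component
beta_v, beta_a, beta_d = rng.normal(size=(3, 300))
grads_perp, mean_dir = orthogonalize_against_mean(emotion_grads)
coords = affective_coordinates(grads_perp, beta_v, beta_a, beta_d)
print(coords.shape)   # (27, 3)
```

Because both the orthogonalized gradients and the reference axes have unit length, each coordinate is a cosine similarity and therefore lies in $[-1, 1]$.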
### 5.2 Results
Figure 1: VAD coordinates of all 27 GoEmotions categories. Axes are cosine similarities with the calibrated $\hat{\boldsymbol{\beta}}_V$ (x) and $\hat{\boldsymbol{\beta}}_A$ (y) vectors; color indicates $\hat{\boldsymbol{\beta}}_D$ cosine (red = high dominance, blue = low dominance).

Table [2](https://arxiv.org/html/2605.26801#S5.T2) reports regression statistics for all 27 emotions. All models reach significance ($p < 10^{-10}$), though effect sizes vary considerably. Gratitude and Admiration achieve the highest correlations ($r = .45$ and $.44$, respectively), while low-frequency or semantically diffuse emotions such as Pride ($r = .13$), Relief ($r = .12$), and Realization ($r = .14$) show the weakest fits.

Table 2: SSD regression results for all 27 GoEmotions categories. All $p < 10^{-10}$.
Figure [1](https://arxiv.org/html/2605.26801#S5.F1) shows the VAD positions of all 27 emotions. The valence ordering follows theoretical predictions closely: joy, admiration, love, and excitement anchor the positive end; annoyance, nervousness, embarrassment, anger, and fear anchor the negative end. On the arousal axis, emotions associated with activation (anger, fear, nervousness) score higher than low-arousal states (relief, sadness, confusion). Dominance (color) closely tracks valence ($r = .96$ across emotions), largely recapitulating the positive–negative structure and adding little independent information. A detrended analysis controlling for this overlap (Appendix [C](https://arxiv.org/html/2605.26801#A3)) reveals that once valence is removed, anger, annoyance, and disgust carry relatively higher residual dominance than fear and nervousness among the high-arousal negative emotions, consistent with the approach–avoidance distinction in dimensional emotion theory. The full numerical coordinates are reported in Appendix [B](https://arxiv.org/html/2605.26801#A2) (Table [5](https://arxiv.org/html/2605.26801#A2.T5)). Cluster tables for all emotion gradients can be found in Appendix [D](https://arxiv.org/html/2605.26801#A4).
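One simple way to separate dominance from valence is to regress the dominance coordinates on the valence coordinates across emotions and inspect the residuals. The sketch below assumes this ordinary least-squares form of detrending; the exact procedure in Appendix C may differ, the function name is hypothetical, and the coordinates are synthetic placeholders rather than the reported values.

```python
import numpy as np

def detrend(target, covariate):
    """Residuals of target after an ordinary least-squares fit on covariate."""
    slope, intercept = np.polyfit(covariate, target, deg=1)
    return target - (slope * covariate + intercept)

# Synthetic coordinates standing in for 27 emotions; not the paper's values.
rng = np.random.default_rng(2)
valence = rng.uniform(-0.5, 0.5, size=27)
dominance = 0.9 * valence + 0.03 * rng.normal(size=27)
residual_dominance = detrend(dominance, valence)
print(round(float(np.corrcoef(valence, dominance)[0, 1]), 2))
```

By construction the residuals are uncorrelated with valence, so any remaining ordering among emotions reflects dominance information that valence does not linearly account for.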
## 6 Study 3: Mapping Big Five Personality Domains and Facets
Figure 2: Big Five domains (diamonds) and facets (circles) in VAD affective space. Axes are cosine similarities with calibrated $\hat{\boldsymbol{\beta}}_V$ (x) and $\hat{\boldsymbol{\beta}}_A$ (y); color indicates $\hat{\boldsymbol{\beta}}_D$ cosine. Full coordinates in Appendix [G](https://arxiv.org/html/2605.26801#A7).

### 6.1 Data
We use the IPIP-NEO-300 dataset (Johnson, [2014](https://arxiv.org/html/2605.26801#bib.bib34)), comprising 307,313 participants who completed all 300 items of the IPIP-NEO inventory. Items are scored on a 1–5 Likert scale; approximately half of the items in each domain are negatively keyed and reverse-coded when computing domain and facet scores. The inventory covers five domains (Neuroticism N, Extraversion E, Openness O, Agreeableness A, Conscientiousness C), each divided into six facets of 10 items.
### 6.2 Constructing Item-Outcome Variables
For each domain (60 items) and facet (10 items), a factor score is computed for each participant as the mean of that participant's item responses after reverse-coding negatively keyed items. Pearson $r$ between each item's *original* (pre-reverse-code) responses and the factor score is computed across all 307,313 participants, yielding the SSD outcome variable. Using original rather than reverse-coded responses preserves text–label alignment: negatively keyed items receive a negative $r$ consistent with their wording, and positively keyed items retain a positive $r$. The AUCK sweep searches $K \in \{2, 4, \ldots, K_{\max}\}$, where $K_{\max} = \min(N_{\text{items}} - 2,\; 30)$. The cap is 30 rather than 120 because the small item counts (60 for domains, 10 for facets) make higher $K$ prone to overfitting.
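The construction of the item-level outcome and the sweep grid can be sketched as below. The function names and the simulated responses are hypothetical, and the factor score is assumed to be a per-participant mean over keyed items, which is the reading under which a correlation across participants is defined.

```python
import numpy as np

def item_outcomes(responses, negatively_keyed, scale_min=1, scale_max=5):
    """Pearson r of each item's ORIGINAL responses with the keyed factor score.

    responses: (participants, items) raw Likert answers.
    negatively_keyed: boolean mask over items.
    """
    raw = np.asarray(responses, dtype=float)
    keyed = raw.copy()
    keyed[:, negatively_keyed] = scale_min + scale_max - raw[:, negatively_keyed]
    factor = keyed.mean(axis=1)                    # one score per participant
    raw_c = raw - raw.mean(axis=0)
    fac_c = factor - factor.mean()
    cov = raw_c.T @ fac_c
    return cov / np.sqrt((raw_c ** 2).sum(axis=0) * (fac_c ** 2).sum())

def k_grid(n_items, cap=30):
    """Even K values from 2 up to min(n_items - 2, cap)."""
    return list(range(2, min(n_items - 2, cap) + 1, 2))

# Simulated facet: two positively keyed and two negatively keyed items.
rng = np.random.default_rng(3)
trait = rng.normal(size=2000)
keys = np.array([False, False, True, True])
signs = np.where(keys, -1.0, 1.0)
latent = 3 + trait[:, None] * signs + 0.5 * rng.normal(size=(2000, 4))
responses = np.clip(np.rint(latent), 1, 5)
r = item_outcomes(responses, keys)
print(np.sign(r))                    # positive for the first two items, negative for the last two
print(k_grid(10), len(k_grid(60)))   # [2, 4, 6, 8] 15
```

The sign pattern shows why the original responses are used: a negatively worded item keeps a negative outcome value, so the regression sees text and label pointing the same way. For a 10-item facet the grid contains only four candidate values of $K$.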
### 6.3 Results
Domain gradients are projected onto the VAD axes via cosine similarity, as in Study 2 (Section [5](https://arxiv.org/html/2605.26801#S5)). Table [3](https://arxiv.org/html/2605.26801#S6.T3) reports regression statistics for the five domains. Four of five reach significance; Agreeableness is non-significant ($p = .27$), likely reflecting the semantic heterogeneity of its items (prosocial warmth items mixed with items referencing others' suffering).
Table 3: SSD regression results for the five Big Five domains (sign-consistent method).
Figure [2](https://arxiv.org/html/2605.26801#S6.F2) plots the five domain positions (diamonds) alongside all 30 facets (circles), with dominance encoded as color. The valence ordering $C > O > A > E > N$ is theoretically coherent: Conscientiousness reaches the highest valence ($V = +0.42$) and dominance ($D = +0.45$) with the lowest arousal ($A = -0.24$), placing it in the calm-control region. Neuroticism occupies the negative-valence/positive-arousal/negative-dominance region, consistent with activated negative affect. Extraversion is distinguished by the highest arousal ($A = +0.34$).
Facets generally cluster within their parent domain's region of the space, providing qualitative support for the hierarchical structure of the Big Five. However, facet-level placements should be interpreted with caution: each model is fit on only 10 items, and only a minority of the 30 facet regressions reach conventional significance (see Appendix [F](https://arxiv.org/html/2605.26801#A6), Table [9](https://arxiv.org/html/2605.26801#A6.T9)). The facet coordinates reported here and in Appendix [G](https://arxiv.org/html/2605.26801#A7) (Table [10](https://arxiv.org/html/2605.26801#A7.T10)) are exploratory and subject to greater sampling uncertainty than the domain-level estimates.
Appendix [E](https://arxiv.org/html/2605.26801#A5) presents the full cluster structure at each pole of the five domain-level semantic gradients. The Conscientiousness positive pole splits into quality assurance vocabulary (*ensure, achieve, optimal*) and product excellence descriptors (*high-quality, robust, sturdy*); its negative pole captures impulsivity and disorder (stupidity/disrespect, behavioral incidents). The Neuroticism positive pole encompasses online hostility/chaos (*evil, doom, chaos, madness*) and explicit emotional distress (*anger, fear, sadness, rage*), while its negative pole groups budgeting and comfort/luxury vocabulary. The Openness positive pole is notably diverse, spanning hedonic pleasure, erotic fantasy, and culinary appreciation, while both negative clusters index political and legislative vocabulary, reflecting the Liberalism facet's co-occurrence with political discourse in the GloVe training corpus. The Extraversion positive pole groups playful and joyous vocabulary (*fun, playful, lighthearted*) alongside enthusiasm and passion terms; its negative pole is dominated by procedural adverbs (*efficiently, smoothly, properly, comfortably*), which appear to be another corpus artifact rather than a clear introversion signal. Agreeableness is omitted here because its domain-level model did not reach significance ($p = .27$).
The alignment between these domain gradients and the GoEmotions emotion gradients from Study 2 is reported in Appendix [H](https://arxiv.org/html/2605.26801#A8) (Table [11](https://arxiv.org/html/2605.26801#A8.T11)), providing a direct illustration of the framework's capacity for cross-construct comparison between otherwise separate measurement traditions.
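Cross-construct alignment of this kind can be computed as a matrix of cosine similarities between two sets of gradients living in the same space. The sketch below assumes cosine similarity as the alignment measure, which may not match Appendix H exactly; the function names, the 3-dimensional toy vectors, and the placeholder labels are hypothetical.

```python
import numpy as np

def cosine_matrix(A, B):
    """Pairwise cosine similarities between the rows of A and the rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def top_matches(similarities, labels, n=2):
    """Labels with the n largest similarities, strongest first."""
    order = np.argsort(similarities)[::-1][:n]
    return [(labels[i], float(similarities[i])) for i in order]

# Toy 3-dimensional gradients with placeholder labels; not estimates from the paper.
domain_grads = np.array([[1.0, 0.0, 0.0]])
emotion_grads = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
emotion_labels = ["emotion_a", "emotion_b", "emotion_c"]
sims = cosine_matrix(domain_grads, emotion_grads)
print(top_matches(sims[0], emotion_labels))
# [('emotion_a', 1.0), ('emotion_b', 0.0)]
```

Sorting the same row in ascending order would surface the constructs a domain is most opposed to, which is the sense in which a domain can be aligned "against" an emotion.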
## 7 Discussion
The three studies support the central premise that word embeddings can provide a shared representational space for psychological constructs. Rather than treating constructs only as scale scores within isolated instruments, SSD represents them as semantic gradients in a common geometry. This makes it possible to compare emotion categories, personality domains, and personality facets using the same operations (projection onto reference axes, neighborhood inspection, and cross-construct positioning), regardless of whether they were measured in the same study. The contribution is therefore not only a mapping of constructs into affective space, but a broader framework for making constructs from different studies semantically comparable.
The VAD analyses provide an initial validation of this framework. Study 1 shows that Valence, Arousal, and Dominance can be recovered as interpretable semantic directions in GloVe space, although their strength varies. Study 2 then shows that emotion gradients estimated from GoEmotions occupy theoretically meaningful positions when projected onto these axes: positive emotions cluster on the positive-valence side, whereas fear, anger, nervousness, embarrassment, and annoyance occupy the negative-valence and higher-arousal region. Dominance is also strongly entangled with valence across emotion categories, which is theoretically expected: affective dominance, potency, and perceived control are often intertwined with evaluative meaning (Russell and Mehrabian, [1977](https://arxiv.org/html/2605.26801#bib.bib1)). The valence-detrended residual pattern (Appendix [C](https://arxiv.org/html/2605.26801#A3)) is modest but interpretable, with anger, annoyance, and disgust showing relatively higher dominance than fear and nervousness. This suggests that the dominance gradient is not reducible to valence alone, even though the two dimensions are strongly coupled.
The personality analysis demonstrates the cross-construct use case more directly. The Big Five domains come from a different measurement tradition than emotion labels, yet their SSD gradients can be projected into the same affective reference space. The resulting domain placements are theoretically coherent: Conscientiousness occupies a high-valence, high-dominance, low-arousal region, consistent with its association with important life outcomes (Roberts et al., [2007](https://arxiv.org/html/2605.26801#bib.bib31)) and with structural accounts emphasizing industriousness, order, self-control, and responsibility (Roberts et al., [2005](https://arxiv.org/html/2605.26801#bib.bib32)). Neuroticism occupies a negative-valence, high-arousal, low-dominance region, consistent with work linking Neuroticism to threat sensitivity and affective reactivity (Robinson et al., [2025](https://arxiv.org/html/2605.26801#bib.bib33)). Extraversion is distinguished primarily by high arousal. Furthermore, aligning these domain gradients with the GoEmotions emotion gradients (Appendix [H](https://arxiv.org/html/2605.26801#A8)) produces interpretable cross-construct correspondences: Neuroticism with remorse and anger, Extraversion with joy and love, and Conscientiousness against embarrassment and nervousness. The facet-level results are more exploratory, since each facet is estimated from only ten items and only a minority of facet regressions reach conventional significance. Nevertheless, they illustrate how shared semantic spaces can expose within-domain heterogeneity that broad trait scores may obscure.
The central interpretive constraint is that the method estimates the semantics of construct measurement language, not the latent construct in isolation. In the personality study, gradients are derived from item texts weighted by their empirical association with domain or facet scores. The resulting coordinates therefore describe how a construct is linguistically operationalized in the instrument. This distinction is important because questionnaire items often refer to situations, social comparisons, or external negative content that are not identical to the evaluative meaning of the trait itself. At the same time, many psychological constructs are not independent of meaning: they are partly constituted through shared appraisals, self-descriptions, behavioral interpretations, and culturally available concepts (Danziger, [1997](https://arxiv.org/html/2605.26801#bib.bib35); Sparti, [2001](https://arxiv.org/html/2605.26801#bib.bib36); Markus and Kitayama, [1991](https://arxiv.org/html/2605.26801#bib.bib37)). For such constructs, semantic gradients may capture aspects of their latent instantiation that are not visible from scale scores alone, because they recover the meaning bundle through which the construct is expressed and understood. More generally, the artifact-like clusters observed for some personality domains expose an open methodological problem: questionnaire-derived gradients are estimated from a small number of short item texts, so limited textual variance can leave the direction underconstrained and sensitive to accidental lexical or corpus-specific regularities.
Future work should examine how to stabilize questionnaire-derived construct gradients. Possible directions include item bootstrapping, stronger regularization, paraphrase augmentation, larger external item pools, and alternative representation models. Sentence-level transformer embeddings are a natural candidate, especially given recent work using such representations to compare psychological questionnaires (Wulff and Mata, [2025](https://arxiv.org/html/2605.26801#bib.bib9)). In the present framework, these embeddings could be regressed onto item–factor correlations rather than used only for aggregate item similarity, potentially capturing compositional item meaning more directly. However, contextual representations introduce their own difficulty: their manifolds may entangle semantic, syntactic, pragmatic, and surface-form information in ways that make distilled gradients harder to interpret or compare across constructs. Thus, a sentence-embedding gradient may appear more stable while making it less clear whether the signal reflects construct-relevant semantics or model-specific artifacts. Determining when contextual embeddings improve construct-space measurement, and when they obscure it, remains an important open question.
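Item bootstrapping, the first of the directions listed above, could take roughly the following form: resample items with replacement, refit the direction, and summarize how closely the resampled directions agree with the full-sample one. This is a speculative sketch, not a procedure used in the paper. The function names are hypothetical, plain least squares on already-compressed item vectors stands in for the full SSD fit, and the data are synthetic.

```python
import numpy as np

def fit_direction(X, y):
    """Unit-norm least-squares coefficient vector (intercept dropped)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:] / np.linalg.norm(coef[1:])

def bootstrap_stability(X, y, n_boot=200, seed=0):
    """Mean cosine between bootstrap-resampled directions and the full-sample direction."""
    rng = np.random.default_rng(seed)
    full = fit_direction(X, y)
    cosines = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))   # resample items with replacement
        cosines.append(float(fit_direction(X[idx], y[idx]) @ full))
    return float(np.mean(cosines))

# Synthetic stand-in for 60 items represented by 5 PCA scores each.
rng = np.random.default_rng(4)
X = rng.normal(size=(60, 5))
y = X @ np.array([1.0, -0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=60)
print(round(bootstrap_stability(X, y), 3))
```

A mean cosine near 1 would indicate that the direction changes little when items are resampled. For a 10-item facet the same diagnostic would be expected to fall, which is one way to quantify the sampling uncertainty noted for the facet-level results.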
## 8 Conclusion
This paper introduced a framework for representing psychological constructs as semantic gradients in a shared embedding space\. Across three studies, we showed that Valence–Arousal–Dominance axes can be recovered from word\-level norms, that emotion categories from GoEmotions occupy theoretically meaningful positions when projected onto these axes, and that Big Five personality domains and facets can be situated in the same affective reference space\. More broadly, the results suggest that embedding spaces can serve as infrastructure for construct\-level comparison: psychological phenomena measured in different datasets, instruments, and research traditions can be expressed in a common semantic geometry, enabling comparisons that are not available from isolated scale scores alone\. VAD provides one compact reference frame for this purpose, but the same approach could be extended to other theoretically motivated dimensions such as agency, morality, sociality, concreteness, epistemic certainty, or ideology\. In line with measurement\-oriented views of representation quality \(Plisiecki,[2026](https://arxiv.org/html/2605.26801#bib.bib24)\), construct\-space methods should be evaluated not only by whether a gradient fits its source variable, but also by whether the resulting placement is stable, interpretable, and supported by nearest\-neighbor evidence\.
## Limitations
The approach inherits the known limitations of GloVe embeddings, including insensitivity to negation and to context\-dependent meaning, as well as biases encoded in the training corpus\. The VAD axes are calibrated on word norms from a convenience sample primarily of native English speakers; cross\-linguistic generalization is not evaluated here\. VAD also provides only one low\-dimensional reference frame for psychological meaning, and its axes should not be treated as exhaustive or fully independent\. In particular, the strong overlap between Valence and Dominance in the emotion analysis shows that projected coordinates can reflect shared semantic structure rather than separable psychological dimensions\.
For the Big Five analysis, item–factor correlations are computed within the IPIP\-NEO\-300 sample, which over\-represents educated, Western, and English\-speaking populations \(Johnson,[2014](https://arxiv.org/html/2605.26801#bib.bib34)\)\. Facet\-level models are fit with only 10 items, constraining statistical power; confidence intervals via item\-level bootstrapping are a natural next step\. More generally, the method estimates the semantics of construct measurement language, not the latent construct in isolation\. Items referencing negative external content \(e\.g\., Sympathy\) may therefore yield depressed valence estimates that reflect item phrasing rather than trait valence, a limitation of purely lexical semantic representations\.
## Ethical Considerations
SSD is not designed as a predictive or profiling technology\. The gradients estimated here characterize aggregate semantic regularities in large datasets and should not be used to make inferences about individuals\. More broadly, semantic placements should not be reified as objective definitions of psychological constructs or as evidence about the traits, capacities, or values of particular persons or groups\. Because embedding spaces can encode social and cultural biases from their training corpora, construct gradients may reproduce or amplify those biases if used without qualitative inspection and external validation\.
All datasets used are publicly available for research purposes\. The IPIP\-NEO\-300 data are anonymous and collected under informed consent\. The text of this manuscript was partially polished with the assistance of a Large Language Model; all revised passages were reviewed and corrected by the author\.
## References
- K\. Danziger \(1997\)Naming the Mind: How Psychology Found Its Language\.1st edition,SAGE Publications,London\(eng\)\.External Links:ISBN 978\-1\-4462\-6532\-1 978\-0\-8039\-7763\-1Cited by:[§7](https://arxiv.org/html/2605.26801#S7.p4.1)\.
- D\. Demszky, D\. Movshovitz\-Attias, J\. Ko, A\. Cowen, G\. Nemade, and S\. Ravi \(2020\)GoEmotions: A Dataset of Fine\-Grained Emotions\.InProceedings of the 58th Annual Meeting of the Association for Computational Linguistics,Online,pp\. 4040–4054\(en\)\.External Links:[Link](https://www.aclweb.org/anthology/2020.acl-main.372),[Document](https://dx.doi.org/10.18653/v1/2020.acl-main.372)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p5.3),[§5\.1](https://arxiv.org/html/2605.26801#S5.SS1.p1.2)\.
- N\. Garg, L\. Schiebinger, D\. Jurafsky, and J\. Zou \(2018\)Word embeddings quantify 100 years of gender and ethnic stereotypes\.Proceedings of the National Academy of Sciences115\(16\) \(en\)\.External Links:ISSN 0027\-8424, 1091\-6490,[Link](https://pnas.org/doi/full/10.1073/pnas.1720347115),[Document](https://dx.doi.org/10.1073/pnas.1720347115)Cited by:[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p1.1)\.
- L\. R\. Goldberg, J\. A\. Johnson, H\. W\. Eber, R\. Hogan, M\. C\. Ashton, C\. R\. Cloninger, and H\. G\. Gough \(2006\)The international personality item pool and the future of public\-domain personality measures\.Journal of Research in Personality40\(1\),pp\. 84–96\(en\)\.External Links:ISSN 00926566,[Link](https://linkinghub.elsevier.com/retrieve/pii/S0092656605000553),[Document](https://dx.doi.org/10.1016/j.jrp.2005.08.007)Cited by:[§2\.3](https://arxiv.org/html/2605.26801#S2.SS3.p1.1)\.
- L\. R\. Goldberg \(1999\)A broad\-bandwidth, public domain, personality inventory measuring the lower\-level facets of several five\-factor models\.Personality psychology in Europe7\(1\),pp\. 7–28\.External Links:[Link](http://admin.umt.edu.pk/Media/Site/STD/FileManager/OsamaArticle/26august2015/A%20broad-bandwidth%20inventory.pdf)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p5.3),[§2\.3](https://arxiv.org/html/2605.26801#S2.SS3.p1.1)\.
- G\. Grand, I\. A\. Blank, F\. Pereira, and E\. Fedorenko \(2022\)Semantic projection recovers rich human knowledge of multiple object features from word embeddings\.Nature Human Behaviour,pp\. 1–13\(en\)\.External Links:ISSN 2397\-3374,[Link](https://www.nature.com/articles/s41562-022-01316-8),[Document](https://dx.doi.org/10.1038/s41562-022-01316-8)Cited by:[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p1.1)\.
- W\. L\. Hamilton, J\. Leskovec, and D\. Jurafsky \(2016\)Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change\.InProceedings of the 54th Annual Meeting of the Association for Computational Linguistics \(Volume 1: Long Papers\),Berlin, Germany,pp\. 1489–1501\(en\)\.External Links:[Link](http://aclweb.org/anthology/P16-1141),[Document](https://dx.doi.org/10.18653/v1/P16-1141)Cited by:[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p1.1)\.
- O\. P\. John, L\. P\. Naumann, and C\. J\. Soto \(2008\)Paradigm shift to the integrative big five trait taxonomy\.InHandbook of personality: Theory and research,Vol\.3,pp\. 114–158\.External Links:[Link](https://www.elaborer.org/cours/psy7124/lectures/John2008.pdf)Cited by:[§2\.3](https://arxiv.org/html/2605.26801#S2.SS3.p1.1)\.
- J\. A\. Johnson \(2014\)Measuring thirty facets of the Five Factor Model with a 120\-item public domain inventory: Development of the IPIP\-NEO\-120\.Journal of Research in Personality51,pp\. 78–89\(en\)\.External Links:ISSN 00926566,[Link](https://linkinghub.elsevier.com/retrieve/pii/S0092656614000506),[Document](https://dx.doi.org/10.1016/j.jrp.2014.05.003)Cited by:[§6\.1](https://arxiv.org/html/2605.26801#S6.SS1.p1.1),[Limitations](https://arxiv.org/html/2605.26801#Sx1.p2.1)\.
- A\. C\. Kozlowski, M\. Taddy, and J\. A\. Evans \(2019\)The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings\.American Sociological Review84\(5\),pp\. 905–949\(en\)\.External Links:ISSN 0003\-1224, 1939\-8271,[Link](https://journals.sagepub.com/doi/10.1177/0003122419877135),[Document](https://dx.doi.org/10.1177/0003122419877135)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p2.1),[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p1.1)\.
- H\. R\. Markus and S\. Kitayama \(1991\)Culture and the self: Implications for cognition, emotion, and motivation\.Psychological Review98\(2\),pp\. 224–253\(en\)\.External Links:ISSN 1939\-1471, 0033\-295X,[Link](https://doi.apa.org/doi/10.1037/0033-295X.98.2.224),[Document](https://dx.doi.org/10.1037/0033-295X.98.2.224)Cited by:[§7](https://arxiv.org/html/2605.26801#S7.p4.1)\.
- R\. R\. McCrae and O\. P\. John \(1992\)An Introduction to the Five‐Factor Model and Its Applications\.Journal of Personality60\(2\),pp\. 175–215\(en\)\.External Links:ISSN 0022\-3506, 1467\-6494,[Link](https://onlinelibrary.wiley.com/doi/10.1111/j.1467-6494.1992.tb00970.x),[Document](https://dx.doi.org/10.1111/j.1467-6494.1992.tb00970.x)Cited by:[§2\.3](https://arxiv.org/html/2605.26801#S2.SS3.p1.1)\.
- T\. Mikolov, K\. Chen, G\. Corrado, and J\. Dean \(2013\)Efficient estimation of word representations in vector space\.arXiv preprint arXiv:1301\.3781\.Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p2.1),[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p1.1)\.
- I\. Montani, M\. Honnibal, A\. Boyd, S\. V\. Landeghem, and H\. Peters \(2023\)Explosion/spaCy: v3\.7\.2: Fixes for APIs and requirements\.Zenodo\.External Links:[Link](https://zenodo.org/doi/10.5281/zenodo.1212303),[Document](https://dx.doi.org/10.5281/ZENODO.1212303)Cited by:[§3](https://arxiv.org/html/2605.26801#S3.p2.1)\.
- J\. Mu, S\. Bhat, and P\. Viswanath \(2017\)All\-but\-the\-Top: Simple and Effective Postprocessing for Word Representations\.arXiv\.Note:Version Number: 2External Links:[Link](https://arxiv.org/abs/1702.01417),[Document](https://dx.doi.org/10.48550/ARXIV.1702.01417)Cited by:[§3](https://arxiv.org/html/2605.26801#S3.p1.4),[§3](https://arxiv.org/html/2605.26801#S3.p2.1)\.
- C\. E\. Osgood, G\. J\. Suci, and P\. H\. Tannenbaum \(1957\)The measurement of meaning\.University of Illinois Press,Urbana\-Champaign\(eng\)\.External Links:ISBN 978\-0\-252\-74539\-3Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p3.1),[§2\.2](https://arxiv.org/html/2605.26801#S2.SS2.p1.1)\.
- J\. Pennington, R\. Socher, and C\. Manning \(2014\)Glove: Global Vectors for Word Representation\.InProceedings of the 2014 Conference on Empirical Methods in Natural Language Processing \(EMNLP\),Doha, Qatar,pp\. 1532–1543\(en\)\.External Links:[Link](http://aclweb.org/anthology/D14-1162),[Document](https://dx.doi.org/10.3115/v1/D14-1162)Cited by:[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p1.1),[§3](https://arxiv.org/html/2605.26801#S3.p2.1)\.
- H\. Plisiecki, P\. Lenartowicz, A\. Pokropek, K\. Małyska, and M\. Flakus \(2025\)Measuring Individual Differences in Meaning: The Supervised Semantic Differential\.PsyArXiv\.External Links:[Link](https://osf.io/gvrsb_v3),[Document](https://dx.doi.org/10.31234/osf.io/gvrsb%5Fv3)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p3.1),[§2\.4](https://arxiv.org/html/2605.26801#S2.SS4.p1.1),[§3](https://arxiv.org/html/2605.26801#S3.p1.7)\.
- H\. Plisiecki \(2026\)The Prediction\-Measurement Gap: Toward Meaning Representations as Scientific Instruments\.arXiv\.Note:Version Number: 1External Links:[Link](https://arxiv.org/abs/2603.10130),[Document](https://dx.doi.org/10.48550/ARXIV.2603.10130)Cited by:[§2\.1](https://arxiv.org/html/2605.26801#S2.SS1.p3.1),[§8](https://arxiv.org/html/2605.26801#S8.p1.1)\.
- B\. W\. Roberts, O\. S\. Chernyshenko, S\. Stark, and L\. R\. Goldberg \(2005\)THE STRUCTURE OF CONSCIENTIOUSNESS: AN EMPIRICAL INVESTIGATION BASED ON SEVEN MAJOR PERSONALITY QUESTIONNAIRES\.Personnel Psychology58\(1\),pp\. 103–139\(en\)\.External Links:ISSN 0031\-5826, 1744\-6570,[Link](https://onlinelibrary.wiley.com/doi/10.1111/j.1744-6570.2005.00301.x),[Document](https://dx.doi.org/10.1111/j.1744-6570.2005.00301.x)Cited by:[§7](https://arxiv.org/html/2605.26801#S7.p3.1)\.
- B\. W\. Roberts, N\. R\. Kuncel, R\. Shiner, A\. Caspi, and L\. R\. Goldberg \(2007\)The Power of Personality: The Comparative Validity of Personality Traits, Socioeconomic Status, and Cognitive Ability for Predicting Important Life Outcomes\.Perspectives on Psychological Science2\(4\),pp\. 313–345\(en\)\.External Links:ISSN 1745\-6916, 1745\-6924,[Link](https://journals.sagepub.com/doi/10.1111/j.1745-6916.2007.00047.x),[Document](https://dx.doi.org/10.1111/j.1745-6916.2007.00047.x)Cited by:[§7](https://arxiv.org/html/2605.26801#S7.p3.1)\.
- M\. D\. Robinson, R\. L\. Irvin, M\. R\. Asad, and H\. Fereidouni \(2025\)Neuroticism’s link to threat sensitivity: Evidence from a dynamic affect reactivity task\.Emotion25\(4\),pp\. 884–895\(en\)\.External Links:ISSN 1931\-1516, 1528\-3542,[Link](https://doi.apa.org/doi/10.1037/emo0001462),[Document](https://dx.doi.org/10.1037/emo0001462)Cited by:[§7](https://arxiv.org/html/2605.26801#S7.p3.1)\.
- J\. A\. Russell and A\. Mehrabian \(1977\)Evidence for a three\-factor theory of emotions\.Journal of Research in Personality11\(3\),pp\. 273–294\(en\)\.External Links:ISSN 0092\-6566,[Link](https://www.sciencedirect.com/science/article/pii/009265667790037X),[Document](https://dx.doi.org/10.1016/0092-6566%2877%2990037-X)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p4.1),[§2\.2](https://arxiv.org/html/2605.26801#S2.SS2.p1.1),[§7](https://arxiv.org/html/2605.26801#S7.p2.1)\.
- J\. A\. Russell \(1980\)A circumplex model of affect\.Journal of Personality and Social Psychology39,pp\. 1161–1178\.Note:Place: USExternal Links:ISSN 1939\-1315,[Document](https://dx.doi.org/10.1037/h0077714)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p4.1),[§1](https://arxiv.org/html/2605.26801#S1.p5.3),[§2\.2](https://arxiv.org/html/2605.26801#S2.SS2.p1.1)\.
- D\. Sparti \(2001\)Making up People: On Some Looping Effects of the Human Kind \- Institutional Reflexivity or Social Control?\.European Journal of Social Theory4\(3\),pp\. 331–349\(en\)\.External Links:ISSN 1368\-4310, 1461\-7137,[Link](https://journals.sagepub.com/doi/10.1177/13684310122225154),[Document](https://dx.doi.org/10.1177/13684310122225154)Cited by:[§7](https://arxiv.org/html/2605.26801#S7.p4.1)\.
- A\. B\. Warriner, V\. Kuperman, and M\. Brysbaert \(2013\)Norms of valence, arousal, and dominance for 13,915 English lemmas\.Behavior Research Methods45\(4\),pp\. 1191–1207\(en\)\.External Links:ISSN 1554\-3528,[Link](http://link.springer.com/10.3758/s13428-012-0314-x),[Document](https://dx.doi.org/10.3758/s13428-012-0314-x)Cited by:[§1](https://arxiv.org/html/2605.26801#S1.p4.1),[§1](https://arxiv.org/html/2605.26801#S1.p5.3),[§2\.2](https://arxiv.org/html/2605.26801#S2.SS2.p1.1),[§4\.1](https://arxiv.org/html/2605.26801#S4.SS1.p1.1)\.
- D\. U\. Wulff and R\. Mata \(2025\)Semantic embeddings reveal and address taxonomic incommensurability in psychological measurement\.Nature Human Behaviour9\(5\),pp\. 944–954\(en\)\.External Links:ISSN 2397\-3374,[Link](https://www.nature.com/articles/s41562-024-02089-y),[Document](https://dx.doi.org/10.1038/s41562-024-02089-y)Cited by:[§2\.3](https://arxiv.org/html/2605.26801#S2.SS3.p2.1),[§7](https://arxiv.org/html/2605.26801#S7.p5.1)\.
## Appendix A VAD Axis Cluster Tables
Table 4: Cluster structure at the poles of the three calibrated VAD semantic gradients \(top\-100 neighbors, $k$\-means, $k \in [2, 8]$\)\. $N$ = cluster size\. Clusters ordered by centroid–gradient alignment\.
## Appendix B GoEmotions VAD Coordinates
Table 5: VAD coordinates of all 27 GoEmotions emotion categories \(cosine similarity with calibrated axes\), sorted by valence\.
## Appendix C Detrended Dominance Analysis
Raw dominance correlates strongly with valence across the 27 emotion categories \($r = .96$\), largely recapitulating the positive–negative structure\. To isolate dominance\-specific information, we regress dominance on valence \(fitted line $\hat{D} = 0.91V - 0.002$\) and compute residuals $D' = D - \hat{D}$\. Figure [3](https://arxiv.org/html/2605.26801#A3.F3) plots valence against $D'$, with arousal encoded as color\.
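The detrending step can be sketched as an ordinary least\-squares fit followed by residual extraction. This is a minimal sketch, not the authors' code: the function name `detrend_dominance` is hypothetical, and the valence and dominance values below are made\-up placeholders rather than the coordinates reported in the tables.

```python
import numpy as np


def detrend_dominance(valence, dominance):
    """Regress dominance on valence (OLS with intercept) and return the residuals."""
    v = np.asarray(valence, dtype=float)
    d = np.asarray(dominance, dtype=float)
    slope, intercept = np.polyfit(v, d, deg=1)  # highest-degree coefficient first
    d_hat = slope * v + intercept               # fitted dominance given valence
    return d - d_hat, slope, intercept


# Made-up coordinates for five hypothetical categories (not the paper's values).
V = np.array([-0.40, -0.35, -0.30, 0.25, 0.45])
D = np.array([-0.30, -0.40, -0.25, 0.20, 0.42])

D_prime, slope, intercept = detrend_dominance(V, D)
print(f"fitted line: D_hat = {slope:.2f} * V + {intercept:.3f}")
print("residuals D':", np.round(D_prime, 3))
```

By construction the residuals sum to zero and are uncorrelated with valence, so any remaining structure in $D'$ is information that valence alone does not account for.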
Figure 3: Valence versus valence\-detrended dominance \($D'$\) for all 27 GoEmotions categories\. Color encodes arousal \(red = high, blue = low\)\. Among negative high\-arousal emotions, anger, annoyance, and disgust carry positive $D'$ \(relatively dominant given their valence\) whereas fear and nervousness carry negative $D'$ \(relatively submissive\), consistent with the approach–avoidance distinction\.

The key separation is within the negative high\-arousal cluster: anger \($D' = +.06$\), annoyance \($D' = +.10$\), and disgust \($D' = +.04$\) lie above zero, while fear \($D' = -.04$\) and nervousness \($D' = -.08$\) fall below, consistent with anger being associated with approach and control and fear with avoidance and submission\. Table [6](https://arxiv.org/html/2605.26801#A3.T6) reports the full detrended coordinates\.

Table 6: VAD coordinates with valence\-detrended dominance $D' = D - (0.91V - 0.002)$, sorted by valence\.
## Appendix D GoEmotions Cluster Tables
Table 7: Cluster structure at the poles of all 27 GoEmotions semantic gradients\. $N$ = cluster size; excerpts are the highest\-cosine Reddit comments for that cluster\.

| Emotion | Pole | $N$ | Cluster Summary \(Top Words / Excerpt\) |
| --- | --- | --- | --- |
| Admiration | \+ | 74 | *Superlative praise*: fantastic, amazing, incredible, wonderful, superb, stunning, terrific, remarkable\. Excerpt: “Truly incredible work ;\)” |
| | \+ | 26 | *Adverbial excellence*: wonderfully, amazingly, fantastically, exceptionally, incredibly, stunningly, superbly, extraordinarily\. Excerpt: “Wonderfully said\!” |
| | − | 34 | *Refusal and blame*: refusing, refused, refuse, refuses, blame, dismiss, ignore, withdraw |
| | − | 37 | *Complaints and worry*: complaining, complain, bother, worried, bothering, worry, whining, whine |
| Amusement | \+ | 55 | *Humour and comedy*: hilarious, funny, laugh, comical, jokes, humorous, amusing, joke |
| | \+ | 45 | *Laughter tokens*: hahah, hahaha, hahahaha, haha, hahahah, lmao, hahahahaha, ahaha |
| | − | 3 | *Gloom and misery*: challengesorry, challenge\!go\!busuu, aas |
| | − | 25 | *Noise artifact*: wretched, dismal, burdened, lackluster, abysmal, forsaken, abandoning, anemic |
| Anger | \+ | 54 | *Insults and slurs*: stupid, moron, dumbass, idiot, morons, dumb, idiots, ignorant\. Excerpt: “There are racist morons everywhere\.” |
| | \+ | 46 | *Profanity*: fucking, fuck, ass, fucked, pussy, slut, cock, cunt |
| | − | 11 | *Elegant and stylish*: flavour, flavors, sweetness, earthy, hint, colours, freshness, pairing |
| | − | 20 | *Subtle flavours*: elegant, understated, distinctive, complemented, complements, subtle, stylish, elegance |
| Annoyance | \+ | 55 | *Insults and profanity*: stupid, shit, dumb, crap, fucking, idiot, fuckin, bullshit |
| | \+ | 45 | *Ignorance and bigotry*: ignorant, hypocritical, idiotic, moronic, bigoted, arrogant, hateful, disrespectful |
| | − | 8 | *Decorative objects*: amethyst, filigree, bracelet, lapis, gemstones, brooch, moonstone, wedgwood |
| | − | 14 | *Emoticons and thanks*: :\), :o\), ^\_^, congrats, thankyou, congratulations, thnx, xoxo |
| Approval | \+ | 36 | *Logical reasoning*: theoretically, plausible, logically, insofar, fundamentally, feasible, reasoning, logical |
| | \+ | 64 | *Formal argumentation*: therefore, however, necessarily, assume, consider, whether, regard, nevertheless |
| | − | 3 | *Extreme negativity*: wretched, pitiful, abject |
| | − | 5 | *Physical symptoms*: coughing, wheezing, sneezing, choking, fainting |
| Caring | \+ | 16 | *Compassionate care*: caring, caregivers, care, compassionate, nurturing, supportive, carers, hospice |
| | \+ | 13 | *Calming and soothing*: calm, calming, reassuring, comforting, relaxing, relax, gentle, reassured |
| | − | 52 | *Astonishing and absurd*: astounding, astonishing, horrifying, startling, mind\-boggling, awe\-inspiring, ludicrous, absurd |
| | − | 48 | *Automotive showcase*: showcased, debuted, prototype, sported, roadster, avant\-garde, previewed, lamborghini |
| Confusion | \+ | 29 | *Faulty inference*: presume, infer, deduce, wrongly, logically, implying, mistaken, erroneously |
| | \+ | 39 | *Epistemic hedging*: explain, assume, referring, whether, suggest, indicate, understood, presumably |
| | − | 11 | *Exclamation artifacts*: \!\!\!\!\!\!\!\!\!, \!\!\!\!\!\!\!\!\!\!\!, \!\!\!\!\!\!\!\!, \!\!\!\!\!\!\!, \!\!\!\!\!\!\!\!\!\!\!\!, \!\!\!\!\!\!, \!\!\!\!\!\!\!\!\!\!\!\!\!, \!\!\!\!\!\!\!\!\!\!\!\!\!\! |
| | − | 13 | *Gratitude and blessing*: gratitude, kindness, blessing, encouragement, blessings, dedication, loving, gracious |
| Curiosity | \+ | 28 | *Wondering and inquiry*: wondered, intrigued, asked, inquired, puzzled, curious, fascinated, pondered |
| | \+ | 8 | *Academic research*: anthropological, ethnographic, archaeological, researches, anthropologist, smithsonian, hypotheses, subspecies |
| | − | 14 | *Exclamation artifacts*: \!\!\!\!\!\!\!, \!\!\!\!\!\!, \!\!\!\!\!\!\!\!\!, \!\!\!\!\!\!\!\!, \!\!\!\!\!, \!\!\!\!\!\!\!\!\!\!\!, \!\!\!\!\!\!\!\!\!\!\!\!, \!\!\!\! |
| | − | 9 | *Informal intensifiers*: sooooo, soooo, sooo, soooooo, sooooooo, awsome, unbelievably, wonderfull |
| Desire | \+ | 31 | *Aspirational striving*: strive, nurture, inspire, embrace, rediscover, reconnect, unite, cherish |
| | \+ | 69 | *Wanting and wishing*: want, ’ll, wish, able, decide, wanting, need, forget |
| | − | 15 | *Unnerving stimuli*: disconcerting, unnerving, startling, disturbing, muffled, shrill, alarming, startled |
| | − | 85 | *Absurdity and mockery*: laughable, ludicrous, idiotic, absurd, moronic, insulting, pathetic, hypocritical |
| Disappointment | \+ | 31 | *Severe harm*: badly, terribly, severely, horribly, hampered, hurt, crippled, painfully |
| | \+ | 18 | *Blame and suffering*: blamed, plagued, blame, woes, disastrous, suffered, exacerbated, blaming |
| | − | 7 | *Administrative jargon*: authorizing, authorizes, designate, memorandum, convene, confer, proponent |
| | − | 21 | *Distinctive features*: unique, combines, incorporates, distinctive, utilizes, blend, blends, uses |
| Disapproval | \+ | 58 | *Misleading and unethical*: misleading, unfair, untrue, absurd, ludicrous, dishonest, unethical, inaccurate |
| | \+ | 42 | *Counter\-argumentation*: contrary, argue, necessarily, justify, oppose, reject, disagree, imply |
| | − | 3 | *Reminiscing*: recollections, reminiscences, reminiscing |
| | − | 14 | *Family relations*: niece, granddaughter, grandpa, grandson, grandma, nephew, nephews, nieces |
| Disgust | \+ | 49 | *Awful and nasty*: awful, smelly, horrible, stinky, nasty, ugly, filthy, rotten |
| | \+ | 51 | *Disgraceful and shameful*: disgusting, disgraceful, appalling, shameful, reprehensible, heinous, deplorable, horrid |
| | − | 17 | *Serene and tranquil*: serene, tranquil, calm, contemplative, paradise, solace, respite, repose |
| | − | 19 | *Theoretical scepticism*: proponents, doubted, theorists, hypothesis, theoretically, sceptical, hypotheses, hypothesized |
| Embarrassment | \+ | 17 | *Uncomfortable and awkward*: embarrassing, uncomfortable, painful, annoying, unpleasant, awkward, irritating, embarrassment |
| | \+ | 16 | *Ridiculous and pathetic*: ridiculous, ludicrous, laughable, idiotic, outrageous, pathetic, moronic, downright |
| | − | 66 | *Professional expertise*: expertise, established, knowledge, partnership, collaboration, excellence, engineering, dedicated |
| | − | 34 | *Gaming and fantasy*: mc, gaia, disciple, feng, alchemy, blacksmith, shui, co\-op |
| Excitement | \+ | 45 | *Festive celebration*: celebration, weekend, festive, celebrate, festivities, thanksgiving, celebrating, summer |
| | \+ | 55 | *Wonderful and spectacular*: wonderful, fantastic, delightful, amazing, exciting, fabulous, unforgettable, spectacular |
| | − | 10 | *Incorrect and erroneous*: incorrect, erroneous, inaccurate, faulty, correct, improper, insufficient, defective |
| | − | 9 | *Assertions*: assertion, assertions, asserting, asserted, assert, refute, accusation, presumption |
| Fear | \+ | 80 | *Terrifying*: terrifying, frightening, horrifying, horrific, dreadful, horrible, disturbing, terrible\. Excerpt: “There’s this which is terrifying to think about” |
| | \+ | 20 | *Feeling terrified*: terrified, frightened, scared, fearful, panicked, horrified, fear, alarmed\. Excerpt: “That’s what I’m terrified of\.” |
| | − | 10 | *Academic honours*: honors, graduated, honours, scholarship, alumnus, mathematics, laude, majored |
| | − | 10 | *Philosophical tradition*: philosophy, tradition, ethos, tenets, pluralism, egalitarian, epistemology, rests |
| Gratitude | \+ | 46 | *Thanks and gratitude*: thank, grateful, thankful, congratulations, gratitude, glad, thanking, sincerely |
| | \+ | 27 | *Informal thanks*: thanx, thanks, congrats, :\-\), pmhi, pmthanks, amthanks, amhi |
| | − | 29 | *Assault and harassment*: raped, assaulted, harassed, handcuffed, molested, beaten, interrogated, humiliated |
| | − | 20 | *Obsession and delusion*: obsessed, wannabe, vampires, pretends, deranged, inexplicably, loner, imagines |
| Grief | \+ | 35 | *Suffering and illness*: suffering, suffered, suffer, illness, debilitating, injuries, devastating, afflicted |
| | \+ | 36 | *Death and tragedy*: killed, death, died, dying, victims, victim, murdered, survived |
| | − | 29 | *Risque slang*: naughty, slutty, kinky, hottie, raunchy, lesbo, freaky, saucy |
| | − | 17 | *Bureaucratic procedure*: bylaws, zoning, by\-laws, bylaw, rulemaking, clarification, omb, fcc |
| Joy | \+ | 16 | *Cheerful sociability*: joyful, joyous, cheerful, playful, cheery, celebratory, boisterous, fun\-filled\. Excerpt: “Wow\! What a joyous companion…How wonderful\!\!\!” |
| | \+ | 30 | *Festive celebration*: christmas, celebration, birthday, thanksgiving, easter, holiday, xmas, festive\. Excerpt: “This is amazing\. This is how Christmas should be\.” |
| | − | 58 | *Misleading claims*: misleading, allegations, falsely, wrongly, inaccurate, erroneously, alleges, allege |
| | − | 42 | *Removal and damage*: removed, removing, scratched, unsightly, remove, discolored, traces, discoloration |
| Love | \+ | 21 | *Beloved and adored*: loved, love, beloved, loves, loving, favorite, lover, adore |
| | \+ | 18 | *Fairy\-tale romance*: fairy, tale, fairytale, princess, enchanted, fairies, wonderland, romance |
| | − | 47 | *Unacceptable and unfair*: unacceptable, unfair, misleading, ineffective, unnecessary, inaccurate, ludicrous, unreasonable |
| | − | 53 | *Violations and penalties*: violations, penalties, enforcement, violation, officials, imposed, delays, fines |
| Nervousness | \+ | 51 | *Anxious and fearful*: anxious, frightened, frustrated, fearful, terrified, scared, worried, agitated |
| | \+ | 49 | *Physical symptoms*: nausea, dizziness, headaches, tiredness, discomfort, symptoms, headache, vomiting |
| | − | 48 | *Religious scripture*: scripture, scriptures, teachings, biblical, testament, bible, tradition, scriptural |
| | − | 52 | *Honour and awards*: honor, award, honour, medal, engraved, excellence, honors, heritage |
| Optimism | \+ | 34 | *Hopeful endeavour*: regain, endeavors, endeavor, revive, hopeful, momentum, prosperous, embark |
| | \+ | 11 | *Thanks and good wishes*: congrats, thanks, glad, luck, thanx, bye, thankful, goodluck |
| | − | 37 | *Shouting and screaming*: yelled, shouted, screamed, yelling, shouting, screaming, angrily, loudly |
| | − | 35 | *Hateful language*: hateful, vulgar, insulting, sexist, obscene, demeaning, derogatory, disgusting |
| Pride | \+ | 27 | *Grateful and thrilled*: grateful, fortunate, pleased, thrilled, delighted, impressed, proud, immensely |
| | \+ | 14 | *Respected and honoured*: respected, honored, esteemed, honoured, renowned, admired, welcomed, fellow |
| | − | 30 | *Technical workarounds*: disable, delete, deleting, config, redirect, manually, disabling, workaround |
| | − | 70 | *Confusion and typos*: confusion, incorrect, weird, avoid, unintentional, typos, misuse, typo |
| Realization | \+ | 26 | *Speculation and foresight*: predict, speculate, comprehend, contemplate, anticipate, examine, foresee, theoretically |
| | \+ | 74 | *Epistemic hedging*: actually, perhaps, realize, probably, however, might, think, unfortunately |
| | − | 7 | *Informal slang*: hott, hawt, bangin, punchy, loveable, zesty, channelsend |
| | − | 35 | *Thanks and blessing*: thanx, congrats, awsome, bro, thankyou, congratulations, cheers, bless |
| Relief | \+ | 43 | *Physical tiredness*: feeling, tired, thankfully, asleep, woke, stay, staying, awake |
| | \+ | 16 | *Soothing and relaxing*: soothing, relaxing, relax, calming, soothe, rejuvenate, rejuvenating, relaxation |
| | − | 45 | *Insulting language*: insulting, idiotic, sexist, ludicrous, disrespectful, racist, absurd, demeaning |
| | − | 12 | *Pop artists*: nicki, minaj, gaga, rappers, soulja, will\.i\.am, diss, ashanti |
| Remorse | \+ | 20 | *Apologies and regret*: sorry, apologize, apologies, apologise, regret, forgive, regrets, dear |
| | \+ | 18 | *Complaining and blame*: complaining, complain, complained, complains, blaming, blame, ignored, ignoring |
| | − | 36 | *Innovative and pioneering*: innovative, pioneering, cutting\-edge, groundbreaking, ground\-breaking, visionary, pioneered, entrepreneurial |
| | − | 64 | *World\-class showcase*: world\-class, showcasing, showcases, showcase, combines, exhilarating, eclectic, showcased |
| Sadness | \+ | 40 | *Grief and sorrow*: grief, sadness, sorrow, anguish, despair, loneliness, heartache, disappointment |
| | \+ | 60 | *Suffering and misery*: badly, terribly, sad, suffering, hurt, severely, miserable, unhappy |
| | − | 6 | *Mathematical notation*: integers, multiplying, numerals, permutations, longitude, millimeter |
| | − | 21 | *Costumes and attire*: attire, donning, donned, garb, sported, costumes, decked, costumed |
| Surprise | \+ | 46 | *Astonished and shocked*: astonished, shocked, amazed, horrified, stunned, surprised, appalled, disgusted |
| | \+ | 44 | *Astonishing and extraordinary*: astonishing, incredible, remarkable, astounding, extraordinary, unbelievable, surprising, startling |
| | − | 7 | *Foreign name tokens*: jia, yin, jie, qi, wid, capricorn, leone |
| | − | 9 | *Medical conditions*: prognosis, cystic, scoliosis, predisposition, musculoskeletal, hcc, diseased, descendants |

## Appendix E Big Five Domain Cluster Tables
Table 8: Cluster structure at the poles of the five Big Five domain semantic gradients \(top\-100 neighbors, $k$\-means, $k \in [2, 8]$\)\. $N$ = cluster size; top two clusters per pole shown\.

| Domain | Pole | $N$ | Cluster Summary \(Top Words\) |
| --- | --- | --- | --- |
| Conscientiousness | \+ | 57 | *Quality assurance*: ensure, achieve, ensuring, deliver, quality, optimal, maintain, optimum |
| | \+ | 43 | *Product excellence*: high\-quality, robust, durable, elegant, excellent, sleek, sturdy |
| | − | 27 | *Stupidity & disrespect*: stupid, idiotic, irresponsible, obnoxious, disrespectful, crazy |
| | − | 21 | *Behavioral incidents*: caused, incident, alleged, outburst, inexplicable, misconduct |
| Neuroticism | \+ | 25 | *Online hostility & chaos*: aka, evil, ghost, doom, chaos, dead, madness, lyrics |
| | \+ | 19 | *Emotional distress*: anger, fear, feelings, sadness, hatred, rage, emotions, jealousy |
| | − | 58 | *Budgeting & planning*: afford, spend, ordinarily, accustomed, procure, devote, anticipate |
| | − | 42 | *Comfort & luxury*: comfortable, accommodations, comforts, sumptuous, upscale, indulge, unwind |
| Extraversion | \+ | 56 | *Enthusiastic energy*: enthusiasm, excitement, joy, passion, happiness, optimism, laughter |
| | \+ | 44 | *Playful joviality*: fun, playful, joyous, comical, lighthearted, joyful, whimsical |
| | − | 44 | *Functional instructions*: properly, able, effectively, allow, determine, maintain, utilize |
| | − | 28 | *Efficient task completion*: efficiently, quickly, swiftly, smoothly, rapidly, reliably, promptly |
| Openness | \+ | 28 | *Hedonic pleasure*: enjoy, fun, exciting, enjoying, fantastic, enjoyable, pleasure |
| | \+ | 19 | *Erotic fantasy*: erotic, sensual, kinky, fantasies, steamy, lustful, foreplay |
| | − | 49 | *Legislative politics*: senate, legislative, republican, democrats, legislature, elections |
| | − | 47 | *Political discourse*: political, liberal, conservative, opposition, democracy, politicians |
| Agreeableness | \+ | 57 | *Civic & charitable*: non\-profit, community, nonprofit, volunteer, charity, charities |
| | \+ | 43 | *Compassionate connection*: hope, love, loved, understand, friends, lives, grateful |
| | − | 56 | *Geometric objects*†: curved, curving, angled, downwards, curled, thrusting, protruding |
| | − | 44 | *Antagonism & opposition*: opponents, intimidate, thwart, attacking, weaken, discredit, evade |

†Embedding\-space artifact; see main text\.
## Appendix F IPIP Facet Regression Statistics
Table 9: SSD regression statistics for all 30 IPIP\-NEO facets \(sign\-consistent method\)\. All facets use $K = 2$ \(10 items each\)\.
## Appendix G Big Five Domain and Facet VAD Coordinates
Table 10: VAD coordinates for all five Big Five domains \(bold\) and 30 facets\. Values are cosine similarities with the calibrated $\hat{\boldsymbol{\beta}}_V$, $\hat{\boldsymbol{\beta}}_A$, $\hat{\boldsymbol{\beta}}_D$ axis vectors\. Domain values reproduced from Table [3](https://arxiv.org/html/2605.26801#S6.T3) for reference\.
## Appendix H Emotion–Personality Alignment
Table [11](https://arxiv.org/html/2605.26801#A8.T11) shows the three most and least aligned GoEmotions categories for each Big Five domain, based on cosine similarity between the orthogonalized emotion gradients ($\boldsymbol{\beta}_{e}^{\perp}$, Study 2) and the domain gradients ($\hat{\boldsymbol{\beta}}_{d}$, Study 3).
Neuroticism aligns with remorse and anger, and negatively with desire and approval. Extraversion aligns with joy and love ($r \approx +.40$), and negatively with disapproval and confusion. Openness and Extraversion show considerable overlap, both loading on excitement and joy; curiosity loads weakly ($r = +.02$). Conscientiousness aligns with desire, caring, and admiration, and negatively with embarrassment ($r = -.32$) and nervousness ($r = -.29$). Agreeableness shows uniformly weak alignment (range $[-.24, +.18]$), consistent with its non-significant domain regression (Table [3](https://arxiv.org/html/2605.26801#S6.T3)).
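The alignment procedure described in this appendix can be illustrated with a minimal sketch. It assumes that "orthogonalized" means projecting the mean emotion direction out of each emotion gradient; the text states only that the mean emotion direction is removed, so the exact operation is an assumption. The function names (`remove_mean_direction`, `rank_alignment`), the labels `emo_a` to `emo_d`, and the three-dimensional toy vectors are hypothetical and are not the paper's code or data.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v))

def remove_mean_direction(gradients):
    """Project the mean direction out of every gradient.

    Assumed reading of 'orthogonalized'; the source does not spell out
    the exact operation.
    """
    names = list(gradients)
    dim = len(gradients[names[0]])
    mean = [sum(gradients[n][i] for n in names) / len(names) for i in range(dim)]
    mean_norm = norm(mean)
    unit = [x / mean_norm for x in mean]
    result = {}
    for n in names:
        coef = dot(gradients[n], unit)  # component along the mean direction
        result[n] = [x - coef * u for x, u in zip(gradients[n], unit)]
    return result

def rank_alignment(emotion_gradients, domain_gradient, k=3):
    """Return the k most and k least aligned emotions for one domain.

    Only the emotion gradients are orthogonalized; the domain gradient
    is used as given, mirroring the description in the text.
    """
    ortho = remove_mean_direction(emotion_gradients)
    sims = sorted(
        ((cosine(vec, domain_gradient), name) for name, vec in ortho.items()),
        reverse=True,
    )
    top = [(name, round(s, 3)) for s, name in sims[:k]]
    bottom = [(name, round(s, 3)) for s, name in sims[-k:]]
    return top, bottom

# Toy 3-dimensional vectors, purely illustrative (not values from the paper).
emotions = {
    "emo_a": [1.0, 0.5, 0.0],
    "emo_b": [-1.0, 0.5, 0.0],
    "emo_c": [0.0, 0.5, 1.0],
    "emo_d": [0.0, 0.5, -1.0],
}
domain = [1.0, 0.2, 0.5]

top, bottom = rank_alignment(emotions, domain, k=2)
print("most aligned:", top)
print("least aligned:", bottom)
```

In this toy setup all four emotion vectors share a common component along the second axis, which the projection removes before the cosine similarities are ranked. Real gradients would have the dimensionality of the embedding space, and `k=3` would match the three-most and three-least layout of Table 11.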
Table 11: Top and bottom 3 GoEmotions categories by cosine similarity with each Big Five domain gradient. Emotion betas are orthogonalized ($\boldsymbol{\beta}_{e}^{\perp}$, mean emotion direction removed) to isolate emotion-specific content. Domain betas are not orthogonalized: with only five domains whose theoretical independence is contested, removing a mean direction would be arbitrary. Values are cosine similarities.