A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity

arXiv cs.LG Papers

Summary

This paper discovers a shared valence axis (V-axis) across modern LLMs and human EEG signals, showing that a single direction from LLM internal representations aligns with neural responses to emotional stimuli. It also identifies the saturation regularity, explaining why LLM-derived supervision fails to improve EEG decoding and how leveraging residual diversity boosts performance.

arXiv:2606.00129v1 Announce Type: new Abstract: Large language models (LLMs) have emerged as powerful representation learners whose internal features increasingly align with human cognition. We study whether modern LLMs can serve as a lens for understanding neural representations in the human brain, focusing on emotional valence in EEG. We first build a one-dimensional valence direction, the V-axis, from modern LLMs using only nine emotion-evocative sentences. We validate it through zero-shot transfer to sentiment benchmarks and cross-model consistency across fourteen LLMs. We then show that this LLM-derived direction maps onto human neural activity. On a public EEG cohort of 123 subjects watching affective videos, a single linear projection on EEG features tracks the V-axis position of each stimulus. Moreover, 36 EEG emotion classifiers trained without exposure to the V-axis spontaneously rediscover the same direction in their internal representations, suggesting that the same valence structure emerges in both language models and human electrophysiology. Yet this convergence does not provide an effective training signal. We test twenty-five alignment strategies, including knowledge distillation, representational similarity, contrastive, and topographic losses; none improve decoding, and sixteen significantly reduce accuracy. We formalize this result as the saturation regularity: once task labels alone drive a brain-decoding network onto the target direction, additional supervision mainly distorts an already-saturated basin, while the load-bearing within-class residual receives little useful gradient. This regularity also indicates where improvement should come from: the residual subspace unreachable by supervision. Motivated by this insight, we ensemble across residual diversity rather than supervising the basin, improving balanced accuracy by 10.5% over the prior best on FACED, with the same effect replicated on SEED-V.

Cached at: 06/02/26, 03:39 PM

# A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity
Source: [https://arxiv.org/html/2606.00129](https://arxiv.org/html/2606.00129)

Yousef A. Radwan (yousef.radwan@kaust.edu.sa), Xuhui Liu (xuhui.liu@kaust.edu.sa), Kilichbek Haydarov (kilichbek.haydarov@kaust.edu.sa), Yuqian Fu (yuqian.fu@kaust.edu.sa), Mohamed Elhoseiny (mohamed.elhoseiny@kaust.edu.sa)

King Abdullah University of Science and Technology (KAUST)

###### Abstract

Large language models \(LLMs\) have become powerful general\-purpose representation learners, with growing evidence that their internal features align with human cognition across high\-level concepts\. In this paper, we investigate how modern LLMs can serve as a lens for understanding human brain signals, focusing on the neural representation of emotional valence in EEG\. We first construct a one\-dimensional valence direction, named the V\-axis, from modern LLMs using just nine emotion\-evocative sentences, and verify it through zero\-shot transfer to standard sentiment benchmarks and cross\-model consistency across fourteen LLMs of widely varying scale\. We then show that the LLM\-derived direction maps directly onto the human brain\. On a public EEG cohort of 123 subjects watching affective videos, a single linear projection on EEG features tracks the V\-axis position of each stimulus\. More strikingly, 36 EEG emotion classifiers trained without ever exposing them to the V\-axis spontaneously rediscover it in their internal features\. The same valence direction lives inside language models and inside human electrophysiology\. However, the LLM–brain convergence does not translate into a straightforward training signal\. We test twenty\-five standard alignment recipes spanning knowledge distillation, representational similarity analysis, contrastive, and topographic losses; none help, and sixteen significantly hurt accuracy\. Consequently, we crystallize this finding into the saturation regularity, a previously unrecognized principle that emerges precisely at the LLM–brain interface: once task labels alone have driven a brain\-decoding network onto the target direction, supervision can only deform an already\-saturated basin, while the load\-bearing within\-class residual receives no gradient\. The saturation regularity has two faces\. 
While it explains why LLM\-derived supervision fails to improve EEG decoding, it equally identifies where the actionable gain actually lives: in the residual subspace that supervision cannot reach\. Acting on the insight, we ensemble across residual diversity rather than supervising the basin, improving balanced accuracy by 10\.5% over the prior best on FACED, a public EEG emotion recognition benchmark, with the same effect replicating on the SEED\-V dataset\.

## 1 Introduction

Pass nine short emotion-evocative sentences through a language model, average the late-layer hidden states class by class, take the first principal component of the resulting nine vectors, and orient it so the Joy centroid projects positive. The single direction that comes out, which we call the *valence axis* (V-axis), predicts movie-review sentiment at AUC 0.832 zero-shot, predicts the EEG response of 123 subjects watching emotional videos at $r = 0.87$, and is rediscovered without supervision by every reasonably strong EEG emotion classifier we tested. Recent work shows that large language models’ internal features align with human cognition across high-level concepts (Kim et al., [2018](https://arxiv.org/html/2606.00129#bib.bib102); Arditi et al., [2024](https://arxiv.org/html/2606.00129#bib.bib72)); we ask whether that alignment can be turned into a usable bridge to human electrophysiology, both as a probe of what brain-decoding networks encode and as a training signal that could improve them.

A nine-sentence axis. The nine stories are one per FACED (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)) class; the language model is Qwen3-4B, chosen as the lead because it sits at the centre of the alignment manifold in our 14-LLM sweep, with the highest per-stimulus alignment to a behavioural valence reference (the recipe itself is model-agnostic, see Appendix [S3](https://arxiv.org/html/2606.00129#A3)). The SST-2 benchmark grounds the AUC 0.832 headline: the V-axis projection alone is within 0.005 of a 5k-example supervised logistic regression (Socher et al., [2013](https://arxiv.org/html/2606.00129#bib.bib109)). And the direction is not idiosyncratic to any one model: across the 14 language models from 560M to 32B parameters, the V-axis converges on essentially the same direction (within Qwen3, off-diagonal $r = +0.995$ over 15 size-pairs).

The same axis lives in the brain. The $r = 0.87$ above unpacks as follows. A ridge regression on 160 channel-band features of the FACED cohort EEG predicts the V-axis position of each of the 28 stimuli; the regressor uses leave-one-stimulus-out cross-validation, no per-subject fitting, and no auxiliary supervision. The Qwen3-4B V-axis itself reaches $r = 0.80$ ($p < 10^{-5}$) on this ridge; a CLIP-text variant of the V-axis built from the same 28 stimulus descriptions tightens this to $r = 0.87$ ($p < 10^{-9}$). The third claim, that EEG networks rediscover the V-axis without ever being shown it, is quantified across 36 FACED-9 checkpoints trained on the 9 emotion labels alone: per-checkpoint V-axis encoding strength in the class-mean subspace predicts balanced accuracy at $r = +0.885$ ($p = 7.8 \times 10^{-13}$, Figure [1](https://arxiv.org/html/2606.00129#S1.F1)). The same valence direction is recovered from inside language models and from inside human electrophysiology, by independent procedures.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x1.png)

Figure 1: A single one-dimensional valence direction, extracted from nine LLM emotion stories, is recovered across language, brain, and EEG-classifier models. (a) Zero-shot probe AUC / $|r|$ of the V-axis (PC1 of nine Qwen3-4B story-class centroids at the penultimate layer) on standard text-sentiment benchmarks; no labels are seen during axis construction. (b) Stimulus-level scatter of LLM V-axis projection vs. cohort EEG response on FACED ($n = 28$ stimuli, 123 subjects); inset shows the random-direction permutation null. (c) 36 EEG-classifier checkpoints (CBraMod and EMOD families) plotted as their class-PC1 V-axis $|r|$ (no V-axis loss used in training) against FACED 9-class balanced accuracy; linear fit $r = +0.885$. The same one-dimensional direction explains text sentiment, predicts evoked EEG, and tracks classifier accuracy without sharing data across the three settings.

Yet supervision fails. The natural way to use such an alignment is an auxiliary loss that pulls the EEG network’s penultimate features toward the V-axis. We test 25 such recipes spanning knowledge distillation (Hinton et al., [2015](https://arxiv.org/html/2606.00129#bib.bib7)), representational similarity analysis (Kriegeskorte et al., [2008](https://arxiv.org/html/2606.00129#bib.bib20)), contrastive losses (Khosla et al., [2020](https://arxiv.org/html/2606.00129#bib.bib8); van den Oord et al., [2018](https://arxiv.org/html/2606.00129#bib.bib95)), parameter-efficient fine-tuning adapters such as LoRA (Hu et al., [2022](https://arxiv.org/html/2606.00129#bib.bib11)), and channel-targeted topographic losses (Padakanti et al., [2025](https://arxiv.org/html/2606.00129#bib.bib93)). None help. Sixteen produce statistically significant accuracy decrements ($\Delta\mathrm{BACC} \leq -0.013$, $p < 0.05$ paired across five seeds); zero produce gains. Five families show *monotonic destruction*: the harder we push the loss, the worse accuracy gets. The transition is sharp: an intervention with no measurable effect on a weak baseline (BACC $\leq 0.62$) becomes a significant negative once the baseline strengthens (BACC $\geq 0.66$). We call this regularity *saturation*, and it is not the same thing as overfitting: overfitting is memorising the training set, while saturation is what happens once task labels alone have driven the network onto the target direction. The auxiliary loss then moves the already-saturated class-mean component further along the target axis (alignment increases by $\Delta|r| = +0.01$ to $+0.36$), but leaves the within-class residual (the part of the features that distinguishes one trial of an emotion from another) at a numerical zero ($\sim 10^{-7}$). Supervision deforms a basin the network is already using and never reaches the load-bearing direction.

Where the gain lives. The same diagnosis identifies where the actionable gain lives: in the within-class residual subspace that supervision cannot reach. The class-mean basin is shared across seeds (every well-trained checkpoint recovers the V-axis at $|r| \in [0.60, 0.77]$); the residual is seed-specific. Ensembling across seeds therefore averages out seed-specific noise in the residual without disturbing the basin. Per-checkpoint residual encoding strength predicts leave-one-out ensemble contribution at $r = +0.74$ ($p = 0.014$, $n = 10$), and the 10-checkpoint ensemble reaches FACED-9 SOTA at $\mathbf{0.6948}$ balanced accuracy (val-selected single checkpoint $\mathbf{0.6755}$), a $+10.5\%$ relative improvement over the prior 0.6287 of EMOD (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101)). The same residual-diversity prescription replicates on SEED-V (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99)) (Appendix [S8](https://arxiv.org/html/2606.00129#A8)). One control sharpens the picture: replacing the language-model-derived class prototypes used by the KD step with random orthonormal vectors changes BACC by $\leq 0.003$, so the SOTA gain comes from the *shape* of the loss (a 9-class structure imposed on the features), not from the *content* of the prototypes. The V-axis serves as a probe of *what* the network has saturated on, not as the source of the supervised signal.
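
The relative improvement quoted above can be checked directly from the three balanced-accuracy figures in the text. The arithmetic below is a reader's worked check, not a formula taken from the paper; the single-checkpoint figure is derived here and is not quoted by the authors.

```latex
\[
\frac{0.6948 - 0.6287}{0.6287} = \frac{0.0661}{0.6287} \approx 0.105
\qquad\text{(10-checkpoint ensemble vs. EMOD)}
\]
\[
\frac{0.6755 - 0.6287}{0.6287} = \frac{0.0468}{0.6287} \approx 0.074
\qquad\text{(val-selected single checkpoint vs. EMOD)}
\]
```

Read this way, the $+10.5\%$ is a relative gain; the absolute gain of the ensemble over EMOD is $0.0661$ balanced accuracy.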

### Contributions\.

(1) A nine-sentence *V-axis probe*: a one-dimensional valence direction extracted from 9 language-model emotion stories that converges across 14 LLMs, predicts text sentiment zero-shot, predicts cohort EEG response, and is rediscovered without supervision by 36 EEG classifiers (§[5](https://arxiv.org/html/2606.00129#S5)–§[7](https://arxiv.org/html/2606.00129#S7)). (2) The *saturation regularity*: across 25 representation-alignment recipes on FACED-9, 16 produce significant decrements and 0 produce gains, with monotonic destruction in 5 families and a sharp transition in the $[0.62, 0.66]$ BACC band; the loss moves the class-mean component by $\Delta|r| = +0.01$ to $+0.36$ but the within-class residual by $\sim 10^{-7}$ (§[3](https://arxiv.org/html/2606.00129#S3)). (3) A residual-diversity ensemble turning the mechanism into a positive prescription: per-checkpoint residual encoding predicts leave-one-out contribution at $r = +0.74$, and a 10-checkpoint ensemble reaches FACED-9 SOTA at $\mathbf{0.6948}$ (§[4](https://arxiv.org/html/2606.00129#S4), §[8](https://arxiv.org/html/2606.00129#S8)). All three replicate on SEED-V (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99)) (Appendix [S8](https://arxiv.org/html/2606.00129#A8)). Limitations are discussed in §[9](https://arxiv.org/html/2606.00129#S9).

## 2 Related Work

### Concept\-direction supervision\.

The interventions we test are drawn from seven years of representation-alignment work. Knowledge distillation (Hinton et al., [2015](https://arxiv.org/html/2606.00129#bib.bib7); Tian et al., [2020](https://arxiv.org/html/2606.00129#bib.bib97); Park et al., [2019](https://arxiv.org/html/2606.00129#bib.bib51)) trains a student’s penultimate features against a fixed teacher target. Representational similarity analysis (Kriegeskorte et al., [2008](https://arxiv.org/html/2606.00129#bib.bib20); Sundaram et al., [2024](https://arxiv.org/html/2606.00129#bib.bib125)) aligns representational geometries through pairwise similarity matrices. Supervised contrastive losses (Khosla et al., [2020](https://arxiv.org/html/2606.00129#bib.bib8); van den Oord et al., [2018](https://arxiv.org/html/2606.00129#bib.bib95); Radford et al., [2021](https://arxiv.org/html/2606.00129#bib.bib94)) pull same-class features together and push different-class features apart. Parameter-efficient adaptation (Hu et al., [2022](https://arxiv.org/html/2606.00129#bib.bib11); Liu et al., [2022a](https://arxiv.org/html/2606.00129#bib.bib98)) inserts low-rank or scaling modules whose updates are localised to a chosen geometry. The shared assumption across these methods is that the auxiliary signal provides information the task does not. This paper studies a regime where it does not.

### Concept directions in language models\.

Arditi et al. ([2024](https://arxiv.org/html/2606.00129#bib.bib72)) showed that refusal in instruction-tuned LMs is mediated by a single direction in residual-stream activations: ablating it eliminates refusal, adding it induces it. Earlier work in this family includes Li et al. ([2023](https://arxiv.org/html/2606.00129#bib.bib104)), Zou et al. ([2023](https://arxiv.org/html/2606.00129#bib.bib103)), and Kim et al. ([2018](https://arxiv.org/html/2606.00129#bib.bib102)). Concurrent work from Anthropic interpretability (Sofroniew et al., [2026](https://arxiv.org/html/2606.00129#bib.bib6)) identified emotion-concept features in production-scale LMs through sparse autoencoders and showed that ablating these features changes downstream behaviour, complementing the cohort-level brain alignment we report here. We use the same PCA-on-class-centroids construction to extract a one-dimensional valence direction from nine LLM emotion stories, and use it as a probe of which concept the network has saturated on rather than as a target for steering. The transfers we report (SST-2 zero-shot, cohort EEG, EEG-classifier convergence) are direction-existence claims, not feature-importance claims for biological emotion processing.

### Brain–LM alignment\.

Toneva and Wehbe ([2019](https://arxiv.org/html/2606.00129#bib.bib88)) first showed that mid-layer transformer activations predict fMRI responses to narrative text; Schrimpf et al. ([2021](https://arxiv.org/html/2606.00129#bib.bib89)), Goldstein et al. ([2022](https://arxiv.org/html/2606.00129#bib.bib90)), and Caucheteux and King ([2022](https://arxiv.org/html/2606.00129#bib.bib91)) extended the picture across MEG, ECoG, and large model families. Huh et al. formalised the emerging picture as the *Platonic Representation Hypothesis* (Huh et al., [2024](https://arxiv.org/html/2606.00129#bib.bib92)). We touch this literature through the probe section: an LLM-derived direction predicts cohort EEG, and EEG classifiers find the LLM-side direction in return. Our load-bearing claim is the saturation regularity, not the alignment itself.

### Frontal\-alpha asymmetry\.

Davidson’s frontal-alpha asymmetry (FAA) (Davidson, [1992](https://arxiv.org/html/2606.00129#bib.bib118); Coan and Allen, [2004](https://arxiv.org/html/2606.00129#bib.bib120); Davidson, [2004](https://arxiv.org/html/2606.00129#bib.bib119)) remains the dominant electrophysiological account of approach/withdrawal emotion, developed primarily on static stimuli. Our cohort signal on video-evoked emotion is posterior-dominant, with frontal asymmetries that survive in direction at smaller magnitude. We read this as an additive scope statement for video paradigms rather than a refutation of FAA.

### Ensemble theory\.

The two-tier mechanism we identify (a saturated class-mean basin plus a seed-specific within-class residual) is a representation-level form of bias–variance (Geman et al., [1992](https://arxiv.org/html/2606.00129#bib.bib107); Krogh and Vedelsby, [1995](https://arxiv.org/html/2606.00129#bib.bib108)) and linear-mode connectivity (Frankle et al., [2020](https://arxiv.org/html/2606.00129#bib.bib105); Wortsman et al., [2022](https://arxiv.org/html/2606.00129#bib.bib106)). Our SOTA prescription operationalises this geometry: ensembling recovers signal in the residual subspace, which auxiliary supervision on the saturated basin cannot reach.

### Baselines\.

The FACED-9 baselines we compare against are CBraMod (Wang et al., [2025](https://arxiv.org/html/2606.00129#bib.bib3)) (0.572), EmotionKD (Liu et al., [2023](https://arxiv.org/html/2606.00129#bib.bib100)) (0.628), and EMOD (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101)) (0.6287), with REVE (El Ouahidi et al., [2025](https://arxiv.org/html/2606.00129#bib.bib4)), LaBraM (Jiang et al., [2024](https://arxiv.org/html/2606.00129#bib.bib18)), and EEGPT (Wang et al., [2024](https://arxiv.org/html/2606.00129#bib.bib127)) as foundation-model references.

## 3 The Saturation Regularity

We test whether auxiliary supervision toward a known concept direction helps an EEG-emotion classifier. The setup uses an EMOD (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101)) backbone (the previous FACED-9 SOTA) at two strengths: a vanilla 0.62 BACC baseline and the SOTA recipe of §[8](https://arxiv.org/html/2606.00129#S8) at 0.66 BACC. The 25-recipe screen ran on the vanilla baseline; the strongest cells were re-evaluated on the SOTA recipe to localise the transition in $[0.62, 0.66]$. Each intervention adds an auxiliary loss that pulls the network’s penultimate features toward a target direction. Targets span LLM-derived class prototypes (the V-axis of §[5](https://arxiv.org/html/2606.00129#S5)), CLIP-text embeddings, and frontal-channel masks matched to the analytical valence ceiling. Loss families span knowledge distillation (Hinton et al., [2015](https://arxiv.org/html/2606.00129#bib.bib7)), representational similarity analysis (RSA) (Kriegeskorte et al., [2008](https://arxiv.org/html/2606.00129#bib.bib20)), contrastive losses, parameter-efficient fine-tuning (PEFT) adapters (Hu et al., [2022](https://arxiv.org/html/2606.00129#bib.bib11)), curriculum schedules, multi-LLM ensembles, channel-targeted topographic losses, and anger-weighted variants (full table in Appendix [S10](https://arxiv.org/html/2606.00129#A10)).

### Result\.

None help. Sixteen produce statistically significant accuracy decrements ($p < 0.05$, paired across five seeds); zero produce gains. Five families show *monotonic destruction*: the harder we push the auxiliary loss, the worse accuracy gets, with no positive setpoint. Even the anger-weighted target that analytically maximises the V-axis ceiling on FACED is among the worst ($\Delta\mathrm{BACC} = -0.054$): supplying the strongest possible target makes things strictly worse, not better.
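
The text says significance is assessed "paired across five seeds" but does not name the test. As a minimal sketch, assuming a two-sided paired t-test on per-seed balanced accuracy, the comparison could look like the following. The function name `paired_t` and the accuracy values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math

# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (five paired seeds give n - 1 = 4 degrees of freedom).
T_CRIT_DF4 = 2.776


def paired_t(baseline, treated):
    """Return the mean paired difference and the paired t statistic."""
    diffs = [t - b for b, t in zip(baseline, treated)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t_stat = mean / math.sqrt(var / n)
    return mean, t_stat


# Hypothetical per-seed BACC values: task-only vs. task + auxiliary loss.
baseline = [0.660, 0.662, 0.658, 0.661, 0.659]
treated = [0.645, 0.650, 0.641, 0.649, 0.640]

mean_diff, t_stat = paired_t(baseline, treated)
significant = abs(t_stat) > T_CRIT_DF4
print(f"mean delta BACC = {mean_diff:+.4f}, t = {t_stat:.2f}, significant = {significant}")
```

Pairing by seed removes seed-to-seed variance from the comparison, which matters when the effect sizes of interest are on the order of one to two points of balanced accuracy.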

### The transition is sharp and unidirectional in baseline strength\.

An intervention whose effect lies within seed noise on a weak baseline (BACC $\leq 0.62$) becomes a statistically significant negative on the strong recipe (BACC $\geq 0.66$). The transition sits cleanly in the band $[0.62, 0.66]$ on FACED-9 (we make no claim about universal cut-offs).

> Saturation (FACED-9, EMOD backbone, V-axis target). *For a FACED-9 classifier $f_{\theta}$ with task-only $\mathrm{BACC}(\theta)$ in the band $[0.62, 0.66]$ or higher, adding any V-axis auxiliary loss $\mathcal{L}_{v}$ at $\lambda > 0$ produces $\mathbb{E}[\Delta\mathrm{BACC}] \leq 0$ across all families tested ($16/25$ significant at $p < 0.05$, $0/25$ positive). The threshold coincides with the regime where the V-axis encoding strength $\rho(\theta)$ saturates across seeds (§[4](https://arxiv.org/html/2606.00129#S4)).*
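
One way to read the statement is as a claim about a weighted two-term objective. The form below is an assumption made for illustration: the symbol $\mathcal{L}_{\text{task}}$ and the additive combination are not spelled out in this section, which only names $\mathcal{L}_{v}$ and $\lambda$.

```latex
\[
\mathcal{L}_{\text{total}}(\theta) \;=\; \mathcal{L}_{\text{task}}(\theta) \;+\; \lambda\,\mathcal{L}_{v}(\theta),
\qquad \lambda > 0,
\]
\[
\Delta\mathrm{BACC} \;=\; \mathrm{BACC}\bigl(\theta_{\lambda}\bigr) - \mathrm{BACC}\bigl(\theta_{0}\bigr),
\]
```

Here $\theta_{0}$ denotes the task-only solution and $\theta_{\lambda}$ the solution trained with the auxiliary term, so the regularity says that the expected $\Delta\mathrm{BACC}$ is non-positive once $\mathrm{BACC}(\theta_{0})$ is in or above the stated band.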

### Saturation is not overfitting\.

Overfitting is memorisation of the training set\. Saturation is a property of how task and auxiliary supervision interact once both target the same direction in feature space\. The mechanism check below makes the distinction concrete\.

### Mechanism check\.

The auxiliary loss does push the network’s class-mean features further along the target direction (alignment increases by $\Delta|r| = +0.01$ to $+0.36$), but it leaves the within-class residual (the part of the features that actually distinguishes one trial of an emotion from another) at a numerical zero ($\sim 10^{-7}$). The basin the network was using to classify gets reshaped along the loss direction; the load-bearing orthogonal residual subspace receives no compensating gradient. The 25 recipes reduce to one principle: *a saturated representation is an unhelpful supervision target*, qualitatively similar to the few-class distillation regime (Müller et al., [2020](https://arxiv.org/html/2606.00129#bib.bib128); Loo et al., [2024](https://arxiv.org/html/2606.00129#bib.bib129); Yuan et al., [2020](https://arxiv.org/html/2606.00129#bib.bib130)), here with the residual-variance mechanism made explicit. The full table, the transition table, and the Path-B failure ($\Delta \in [-0.0193, -0.0145]$) are in Appendix [S10](https://arxiv.org/html/2606.00129#A10)–[S11](https://arxiv.org/html/2606.00129#A11).

## 4 Mechanism: Saturated Basin, Load-Bearing Residual

§[3](https://arxiv.org/html/2606.00129#S3) reported that V-axis supervision moves the class-mean component of the network’s features by up to $+0.36$ but moves the within-class residual by a numerical zero. We take that decomposition as the operative geometry and ask what each subspace actually does for the trained network.
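
The decomposition referred to here can be sketched as splitting every feature vector into its class mean plus a within-class residual. This is a minimal illustration with made-up two-dimensional features; the function name `decompose` is hypothetical, and the paper's exact subspace definitions may differ.

```python
def decompose(features, labels):
    """Split each feature vector into (class mean, within-class residual)."""
    dim = len(features[0])
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * dim)
        for i in range(dim):
            acc[i] += x[i]
        counts[y] = counts.get(y, 0) + 1
    means = {y: [s / counts[y] for s in sums[y]] for y in sums}
    # Class-mean part: every trial is replaced by the centroid of its class.
    class_part = [means[y] for y in labels]
    # Residual part: what is left after removing the class centroid.
    residual = [[x[i] - means[y][i] for i in range(dim)]
                for x, y in zip(features, labels)]
    return class_part, residual


# Toy features: two classes, two trials each.
features = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [12.0, 2.0]]
labels = [0, 0, 1, 1]

class_part, residual = decompose(features, labels)
print(class_part)  # class centroids repeated per trial
print(residual)    # within-class residuals; they sum to zero inside each class
```

By construction the residuals of each class sum to zero, so any statistic computed on class means alone is blind to the residual part.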

### Different seeds find the same axis\.

Across 36 EEG-emotion checkpoints from two foundation-model backbones (CBraMod (Wang et al., [2025](https://arxiv.org/html/2606.00129#bib.bib3)) and EMOD (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101))) and six recipe variants, balanced accuracy correlates with V-axis encoding strength at $r = +0.885$ in the class-mean subspace ($p = 7.8 \times 10^{-13}$) and $r = +0.738$ in the orthogonal residual (Figure [2](https://arxiv.org/html/2606.00129#S4.F2)). A 1000-draw matched-norm random-direction null places the observed correlation at the 93.5th percentile (one-sided $p \approx 0.065$): trending V-axis-specific but not reaching $\alpha = 0.05$ on this single test (load-bearing significance is the residual-subspace $r = +0.738$, $p \approx 3 \times 10^{-7}$). The class-mean basin is essentially saturated across seeds ($|r| \in [0.60, 0.77]$): every well-trained checkpoint finds the same direction.
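
How a percentile in a 1000-draw null turns into a one-sided p-value depends on the counting convention, which the text does not state. The sketch below, with a hypothetical helper `null_summary` and a synthetic evenly spaced null, shows the two common conventions. For an observation at the 93.5th percentile of 1000 draws they give about 0.065 and 0.066, which is one possible reading of why the text reports $p \approx 0.065$ and the Figure 2 caption $p = 0.066$.

```python
def null_summary(observed, null_values):
    """Percentile of `observed` in the null, plus two one-sided p-value conventions."""
    n = len(null_values)
    exceed = sum(v >= observed for v in null_values)
    percentile = 100.0 * (n - exceed) / n
    p_plain = exceed / n                 # fraction of null draws at or above the observation
    p_plus_one = (exceed + 1) / (n + 1)  # add-one convention, counts the observation itself
    return percentile, p_plain, p_plus_one


# Synthetic null: 1000 evenly spaced values standing in for random-direction correlations.
null_values = [i / 1000 for i in range(1000)]
observed = 0.9345  # placed so that exactly 65 null values lie at or above it

percentile, p_plain, p_plus_one = null_summary(observed, null_values)
print(percentile, p_plain, round(p_plus_one, 3))
```

Neither convention changes the conclusion drawn in the text: the class-mean test alone does not reach $\alpha = 0.05$.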

![Refer to caption](https://arxiv.org/html/2606.00129v1/x2.png)

Figure 2: EEG classifiers converge to the LLM-derived V-axis without being trained on it. (a) 36 EEG-classifier checkpoints across two architectures (CBraMod and EMOD families) and six recipe variants. $x$-axis: absolute correlation between class-PC1 of the penultimate features and the LLM V-axis (no V-axis loss used in training). $y$-axis: FACED 9-class balanced accuracy. Linear fit $r = +0.885$ ($p = 7.8 \times 10^{-13}$, $n = 36$). (b) Matched-norm random-direction null: 1000 random unit directions in feature space, each producing its own across-checkpoint correlation. The V-axis sits at the 93.5th percentile (one-sided $p = 0.066$): the convergence trends V-axis-specific without reaching $\alpha = 0.05$ on this test. The within-class residual alignment carries the statistically robust signal ($r = +0.738$, $p = 3 \times 10^{-7}$; §[4](https://arxiv.org/html/2606.00129#S4)).

### The residual is what averaging recovers\.

Within-class residual encoding predicts ensemble contribution: across the 10 checkpoints used to build our SOTA ensemble (§[8](https://arxiv.org/html/2606.00129#S8)), per-checkpoint residual strength predicts leave-one-out contribution at $\mathbf{r = +0.74}$ ($p = 0.014$). Bootstrap CIs and per-deletion ranges are in Appendix [S12](https://arxiv.org/html/2606.00129#A12).

> Residual Contribution Regularity. *For a saturated EEG classifier with within-class residual encoding of the V-axis at strength $\rho_{k}$, the leave-one-out ensemble contribution scales linearly with $\rho_{k}$ ($p = 0.014$, $n = 10$, FACED 9-class). $n = 10$ is small; we report this as an empirical regularity rather than a law.*
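
The leave-one-out contribution used in this regularity can be sketched as the drop in ensemble balanced accuracy when one checkpoint is removed. The sketch assumes the ensemble averages class probabilities across checkpoints, which this section does not specify; all function names and the three-member toy pool are hypothetical.

```python
def balanced_accuracy(predictions, labels):
    """Mean of per-class recall."""
    recalls = []
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        recalls.append(sum(predictions[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def ensemble_predict(prob_sets):
    """Average class probabilities over checkpoints, then take the argmax per trial."""
    n_members = len(prob_sets)
    n_classes = len(prob_sets[0][0])
    preds = []
    for t in range(len(prob_sets[0])):
        avg = [sum(m[t][c] for m in prob_sets) / n_members for c in range(n_classes)]
        preds.append(max(range(n_classes), key=lambda c: avg[c]))
    return preds


def leave_one_out_contributions(prob_sets, labels):
    """Full-ensemble BACC and, per member, full BACC minus BACC without that member."""
    full = balanced_accuracy(ensemble_predict(prob_sets), labels)
    contributions = []
    for k in range(len(prob_sets)):
        rest = prob_sets[:k] + prob_sets[k + 1:]
        contributions.append(full - balanced_accuracy(ensemble_predict(rest), labels))
    return full, contributions


# Toy pool: 3 checkpoints, 4 trials, 2 classes. Each member errs on a different trial.
labels = [0, 0, 1, 1]
members = [
    [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7]],
    [[0.6, 0.4], [0.8, 0.2], [0.6, 0.4], [0.1, 0.9]],
    [[0.7, 0.3], [0.7, 0.3], [0.3, 0.7], [0.6, 0.4]],
]

singles = [balanced_accuracy(ensemble_predict([m]), labels) for m in members]
full, contributions = leave_one_out_contributions(members, labels)
print(singles, full, contributions)
```

In the toy pool every member makes a different mistake, so averaging cancels the errors even though no single member is perfect. That is the intuition behind ensembling over seed-specific residuals.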

### Direct test: ablate the residual at inference time\.

A directional ablation (Arditi et al., [2024](https://arxiv.org/html/2606.00129#bib.bib72)), which projects the LLM-derived V-axis residual out of the classifier’s penultimate features at inference time without retraining, drops accuracy on every one of the 10 SOTA-pool checkpoints (mean $\Delta\mathrm{BACC} = -0.0157$, mean $z \approx 7.7$ above a matched random-direction null). The residual carries the load. This decomposition (saturated basin, load-bearing residual) is the operative geometry that §[3](https://arxiv.org/html/2606.00129#S3) predicted from the failure of V-axis supervision and that §[8](https://arxiv.org/html/2606.00129#S8) turns into a positive prescription.
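
At its core, a directional ablation removes the component of each feature vector along one direction. The sketch below shows that generic projection step only; how the paper restricts the V-axis to the residual subspace before ablating is not described in this section, and the name `ablate_direction` is hypothetical.

```python
import math


def ablate_direction(features, direction):
    """Remove the component along `direction` from every feature vector."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    ablated = []
    for x in features:
        coeff = sum(xi * ui for xi, ui in zip(x, unit))  # projection coefficient
        ablated.append([xi - coeff * ui for xi, ui in zip(x, unit)])
    return ablated


# Toy features in 2-D; ablate the second coordinate axis (given unnormalised).
features = [[3.0, 4.0], [1.0, 0.0]]
direction = [0.0, 2.0]

ablated = ablate_direction(features, direction)
print(ablated)
```

Because the operation is applied to frozen features at inference time, any accuracy change reflects what the trained classifier reads from that direction, not a retraining effect.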

## 5 Probing the Saturated Concept: A Valence Direction in LLMs

To test that the saturation regularity operates on a real semantic axis rather than a noise dimension, we extract one explicitly. The construction uses nine stories. Given a language model, we author one short emotion-evocative story per FACED class and generate 50 paraphrases per story by independent LLM calls. For each class we average the language model’s last-token activation (at the penultimate layer) over all 50 paraphrases; the resulting vector is the class *centroid*, a single point in feature space that summarises the model’s representation of that emotion. Principal component analysis (PCA) on the nine centroids gives the V-axis as the first principal component, oriented so the Joy centroid projects positive. We use Qwen3-4B as the lead model and verify the recipe is robust to paraphrase budget, prompt rephrasing, and model choice (Appendix [S2](https://arxiv.org/html/2606.00129#A2)).
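
The recipe above (class centroids, PCA, sign orientation) can be sketched on synthetic vectors standing in for penultimate-layer activations. Everything here is illustrative: the class names other than Joy and Anger, the valence values, the 6-dimensional feature space, and the 5 paraphrases per class (the paper uses 50) are assumptions. PC1 is found by power iteration, and the sign is fixed on the mean-centred Joy centroid, which is one reading of "projects positive".

```python
import math
import random


def class_centroids(activations_by_class):
    """Average the activation vectors of each class into one centroid."""
    centroids = {}
    for label, vectors in activations_by_class.items():
        dim = len(vectors[0])
        centroids[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return centroids


def first_principal_component(rows, iterations=100, seed=0):
    """PC1 of the mean-centred rows, found by power iteration."""
    rng = random.Random(seed)
    dim = len(rows[0])
    mean = [sum(r[i] for r in rows) / len(rows) for i in range(dim)]
    centered = [[r[i] - mean[i] for i in range(dim)] for r in rows]
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # Multiply by Xc^T Xc without forming the matrix: Xc^T (Xc d).
        scores = [sum(c[i] * direction[i] for i in range(dim)) for c in centered]
        updated = [sum(s * c[i] for s, c in zip(scores, centered)) for i in range(dim)]
        norm = math.sqrt(sum(u * u for u in updated))
        direction = [u / norm for u in updated]
    return direction, mean


def build_v_axis(activations_by_class, positive_class="Joy"):
    """Unit-norm PC1 of the class centroids, flipped so the positive class scores > 0."""
    centroids = class_centroids(activations_by_class)
    direction, mean = first_principal_component(list(centroids.values()))
    anchor = centroids[positive_class]
    score = sum((anchor[i] - mean[i]) * direction[i] for i in range(len(direction)))
    return direction if score >= 0 else [-d for d in direction]


# Synthetic stand-in for LLM activations: valence times a hidden direction, plus noise.
random.seed(0)
raw = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0]
scale = math.sqrt(sum(x * x for x in raw))
hidden = [x / scale for x in raw]
valence = {"Joy": 0.9, "class_2": 0.7, "class_3": 0.6, "class_4": 0.5, "Neutral": 0.0,
           "class_6": -0.5, "class_7": -0.6, "class_8": -0.7, "Anger": -0.8}
activations = {
    label: [[v * h + random.gauss(0.0, 0.02) for h in hidden] for _ in range(5)]
    for label, v in valence.items()
}

v_axis = build_v_axis(activations)
cosine = sum(a * b for a, b in zip(v_axis, hidden))
print(f"cosine between recovered axis and hidden direction: {cosine:.4f}")
```

With only nine points the PCA is cheap, so the practical cost of the recipe lies in producing the paraphrases and running the language model, not in the axis extraction.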

### The same direction across modalities\.

The V-axis recipe recovers human affect across three settings that share no training data (Table [1](https://arxiv.org/html/2606.00129#S5.T1)): the text result is true zero-shot projection; the EEG and vision results use the same nine-story recipe re-extracted in their native modality plus a small linear probe.

Table 1: Cross-modal external validity of the V-axis recipe (PCA on nine emotion-class centroids). The Qwen3-4B V-axis is the lead model for both Text and Brain rows. Vision uses CLIP-image because the language model has no image side; a CLIP-text variant of the V-axis on the same EEG ridge reaches $r = 0.87$ ($p < 10^{-9}$, §[6](https://arxiv.org/html/2606.00129#S6)). Full sentiment, lexicon, and multilingual results in Appendix [S3](https://arxiv.org/html/2606.00129#A3).

The LLM-side direction is a strong sentiment classifier: 0.832 SST-2 AUC matches supervised LR on 5k examples (0.837), beats SBERT prototype-cosine (0.793; Reimers and Gurevych, [2019](https://arxiv.org/html/2606.00129#bib.bib1)), and trails 355M RoBERTa-MNLI zero-shot (0.912; Liu et al., [2019](https://arxiv.org/html/2606.00129#bib.bib2)) by 0.08, all from nine stories of supervision.
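
Zero-shot evaluation of a direction needs no classifier: each example is scored by its projection onto the axis and the scores are ranked against the labels. A minimal sketch, with hypothetical helper names and made-up scores, computes AUC as the probability that a random positive example outscores a random negative one.

```python
def project(vector, axis):
    """Scalar projection of a feature vector onto the axis."""
    return sum(x * a for x, a in zip(vector, axis))


def roc_auc(scores, labels):
    """Probability that a positive example outscores a negative one (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: a 2-D axis and four made-up sentence features.
axis = [1.0, 0.0]
features = [[0.9, 0.3], [0.2, -0.1], [0.3, 0.8], [0.1, 0.5]]
labels = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative sentiment

scores = [project(f, axis) for f in features]
auc = roc_auc(scores, labels)
print(auc)
```

Because AUC depends only on the ranking of the projections, no threshold or label is needed to build the axis; labels enter only at evaluation time.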

### The same direction across families\.

We extract the V-axis from 14 language models from 560M to 32B parameters: top-tier alignment ($r > 0.85$ against a behavioural valence reference) holds for all six Qwen3 sizes plus Mistral-7B; Llama-4-Scout, Gemma-27B, and Gemma-4 sit in a middle tier; three older small models (Pythia, TinyLlama, BLOOM) fall below detection threshold (per-LLM table in Appendix [S3](https://arxiv.org/html/2606.00129#A3)). Within the Qwen3 family, the V-axis is essentially scale-invariant (off-diagonal $r = +0.995$ across 15 pairs); across families, correlations are looser ($r = +0.585$ over 48 pairs). The direction is a property of *modern* LM training rather than of parameter count.
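
Models of different sizes have different hidden dimensions, so their axes cannot be compared coordinate by coordinate. One plausible way to compute the off-diagonal correlations, assumed here and not confirmed by this section, is to correlate the projections each model assigns to a shared set of items and average over all model pairs. Six models give $\binom{6}{2} = 15$ pairs, matching the count quoted for the Qwen3 family. Model names and values below are synthetic.

```python
import math
import random
from itertools import combinations


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_off_diagonal_r(projections_by_model):
    """Number of model pairs and the mean pairwise correlation of their projections."""
    pairs = list(combinations(sorted(projections_by_model), 2))
    rs = [pearson_r(projections_by_model[a], projections_by_model[b]) for a, b in pairs]
    return len(pairs), sum(rs) / len(rs)


# Synthetic projections of nine shared items for six hypothetical model sizes.
random.seed(0)
base = [0.9, 0.7, 0.6, 0.5, 0.0, -0.5, -0.6, -0.7, -0.8]
projections = {
    f"model_{k}": [b + random.gauss(0.0, 0.01) for b in base] for k in range(6)
}

n_pairs, mean_r = mean_off_diagonal_r(projections)
print(n_pairs, round(mean_r, 3))
```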

### Specificity\.

The recipe is concept-generic (20 further concepts produce a working axis, 17 exceed AUC 0.95; Appendix [S4](https://arxiv.org/html/2606.00129#A4)). Nonce-word ablation drops SST-2 to chance and random-Gaussian directions give chance lexicon recovery, so the V-axis is a real semantic direction, not a noise artefact (Appendix [S6](https://arxiv.org/html/2606.00129#A6)). An Arditi-style same-model ablation (Arditi et al., [2024](https://arxiv.org/html/2606.00129#bib.bib72)) on Qwen3-4B quantifies how much sentiment lives along the V-axis: a logistic-regression probe on the full features reaches SST-2 AUC 0.909 and drops to 0.907 after projecting the V-axis out and retraining ($z = +9.8$ above a 20-direction random-direction null). The V-axis carries detectable sentiment signal but is not the sole sentiment-bearing direction.

## 6 The Probe Predicts Cohort EEG

We use FACED (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)): 123 subjects watching 28 emotional video clips, recorded with 32 EEG channels at 250 Hz. From the cohort EEG averaged over subjects and time, we fit a single *ridge regression* (a linear model with $L_2$ regularisation) predicting the V-axis projection of each stimulus’s textual description from the 160 channel–band features. The ridge prediction matches the Qwen3-4B V-axis at $\mathbf{r = +0.80}$ ($n = 28$ stimuli, $p < 10^{-5}$). Substituting the CLIP-text V-axis on the same 28 stimulus descriptions reaches $\mathbf{r = +0.87}$ ($p < 10^{-9}$; Figure [3](https://arxiv.org/html/2606.00129#S6.F3)): the brain signal aligns with the LLM-derived direction across two distinct LLM substrates. Three controls support the brain–LM correlation: a matched-norm random direction drops the same ridge to $r = +0.07$ ($p = 0.73$); split-half subject reliability on per-stimulus valence is $r = +0.99$; and the same EEG ridge predicts human-rated valence at $r = +0.86$, matching the V-axis prediction to within $0.01$, so the brain signal is at the same strength a panel of human raters would produce. Multi-comparison correction and other controls are in Appendix [S7](https://arxiv.org/html/2606.00129#A7).
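To make the analysis concrete, here is a minimal sketch of a ridge fit from 160 channel–band features to a per-stimulus axis projection, together with a matched-norm random-direction control. All data are synthetic. The leave-one-stimulus-out evaluation and `alpha=1.0` are assumptions of this sketch, since the section does not state how the ridge was validated or regularised, and every function and variable name is illustrative.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Ridge regression with an unpenalised intercept, solved in dual form."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Dual form is cheap when samples (28 stimuli) << features (160 cells).
    dual = np.linalg.solve(Xc @ Xc.T + alpha * np.eye(len(y)), yc)
    w = Xc.T @ dual
    return w, y_mean - x_mean @ w


def loo_predictions(X, y, alpha):
    """Leave-one-stimulus-out ridge predictions."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        w, b = fit_ridge(X[train], y[train], alpha)
        preds[i] = X[i] @ w + b
    return preds


def pearson_r(a, b):
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(0)
n_stim, n_feat, d_text = 28, 160, 64
text_emb = rng.normal(size=(n_stim, d_text))      # toy description embeddings
v_axis = rng.normal(size=d_text)
v_axis /= np.linalg.norm(v_axis)
random_dir = rng.normal(size=d_text)
random_dir *= np.linalg.norm(v_axis) / np.linalg.norm(random_dir)  # matched norm

target = text_emb @ v_axis                         # axis projection per stimulus
control = text_emb @ random_dir                    # random-direction projection
# Toy EEG features that carry the target plus unit-variance noise.
eeg = np.outer(target, rng.normal(size=n_feat)) + rng.normal(size=(n_stim, n_feat))

r_axis = pearson_r(loo_predictions(eeg, target, alpha=1.0), target)
r_ctrl = pearson_r(loo_predictions(eeg, control, alpha=1.0), control)
print(f"V-axis r = {r_axis:.2f}, random-direction r = {r_ctrl:.2f}")
```

With only 28 stimuli and 160 features, an in-sample ridge fit can match almost any target, which is why this sketch scores held-out predictions and compares them against the random-direction control.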

![Refer to caption](https://arxiv.org/html/2606.00129v1/x3.png)

Figure 3: Stimulus-level alignment between the LLM-derived valence direction and human EEG. Each point is one of the 28 FACED video stimuli; $x$ is the LLM V-axis projection of the textual stimulus description, $y$ is the single ridge-regression prediction of that projection from all 160 cohort channel–band features (32 channels $\times$ 5 frequency bands, averaged over 123 subjects and over each clip). Cohort Pearson $r = +0.87$ ($n = 28$, $p < 10^{-9}$). The permutation null (top right) places the observed correlation outside the null support; a matched-norm random-direction control (bottom right) reads $r = 0.07$. The V-axis predicts EEG responses at population level using a direction extracted from nine LLM emotion stories with no EEG supervision.

The same picture survives across all 14 LLMs (Qwen3 family mean $r = +0.796 \pm 0.011$ irrespective of size) and on the held-out SEED-V dataset (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99)), where the FACED-derived V-axis ranks the five SEED-V emotions at $r = +0.96$ without retraining. Appendix [S8](https://arxiv.org/html/2606.00129#A8) re-derives the V-axis from scratch on SEED-V and re-tests every claim.

## 7 Brain Topography: A Posterior-Visual Scope Statement

The cohort signal is posterior-dominant: region-mean correlation is strongest at occipital electrodes ($|r| = 0.21$) and weakest at frontal electrodes ($|r| = 0.16$), with the largest single cell at PO3/$\gamma$ ($r = +0.48$; Figure [4](https://arxiv.org/html/2606.00129#S7.F4)). The signal is carried by an *anger-versus-warm-positive* contrast on 9 of the 28 video clips: cohort $r = +0.87$ at PO3/$\gamma$ for Anger, Amusement, and Tenderness, against $r = -0.02$ on the remaining 19 mid-valence stimuli. Removing Anger alone shrinks the cohort correlation to $+0.33$. Full per-region tables, the Simpson’s-paradox per-subject caveat, time-resolved late-positive-potential (LPP) window dynamics, and theta–gamma phase-amplitude coupling tests are in Appendix [S9](https://arxiv.org/html/2606.00129#A9).
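The topography statistics can be sketched as a per-cell Pearson correlation map over channels and bands, followed by a region-level mean of $|r|$. The example below uses random toy data with one planted cell. The channel-to-region grouping is a placeholder assumption and does not reflect the actual FACED montage, and the function names are illustrative.

```python
import numpy as np


def cell_correlations(power: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Pearson r between `target` (n_stim,) and every channel-band cell.

    power: (n_stim, n_channels, n_bands) cohort-averaged band power.
    Returns an (n_channels, n_bands) map of correlations.
    """
    p = power - power.mean(axis=0, keepdims=True)
    t = target - target.mean()
    cov = np.tensordot(t, p, axes=(0, 0))
    return cov / np.sqrt((t ** 2).sum() * (p ** 2).sum(axis=0))


def region_mean_abs(r_map: np.ndarray, regions: dict) -> dict:
    """Mean |r| over all channels and bands in each region."""
    return {name: float(np.abs(r_map[idx]).mean()) for name, idx in regions.items()}


rng = np.random.default_rng(0)
power = rng.normal(size=(28, 32, 5))       # toy data: 28 stimuli, 32 channels, 5 bands
target = rng.normal(size=28)               # toy axis projection per stimulus
power[:, 3, 4] += 3.0 * target             # plant a signal in one cell

# Hypothetical channel-index grouping; a real montage needs its own mapping.
regions = {
    "frontal": list(range(0, 8)),
    "central": list(range(8, 16)),
    "parietal": list(range(16, 24)),
    "occipital": list(range(24, 32)),
}

r_map = cell_correlations(power, target)
peak = np.unravel_index(np.argmax(np.abs(r_map)), r_map.shape)
print(peak, region_mean_abs(r_map, regions))
```

Re-running `cell_correlations` on a subset of stimuli (for example, with one emotion category removed) is one way to probe how much a handful of clips drives the cohort correlation.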

![Refer to caption](https://arxiv.org/html/2606.00129v1/x4.png)

Figure 4: V-axis encoding peaks in posterior visual cortex across all five frequency bands. (a–e) MNE-style scalp topomaps, one per band ($\delta$, $\theta$, $\alpha$, $\beta$, $\gamma$), of the cohort Pearson correlation between the LLM V-axis (28 stimulus projections) and per-channel band power on FACED ($n = 123$ subjects). Diverging colourmap: positive (red) vs. negative (blue) correlation; symmetric limits across bands. Gold circles mark the top channel per band; text under each topomap names that channel and its $r$ value (strongest cell PO3/$\gamma$, $r = +0.48$). The colourbar (centred) gives Pearson $r$; the small head diagram (right) shows the 32-channel 10–20 montage used in all five panels (A1/A2 mastoids excluded from plotting). (f) Region-mean $|r|$ aggregated over all five bands: occipital ($0.21$) $>$ parietal ($0.18$) $>$ central ($0.18$) $>$ frontal ($0.16$). The cohort signal is posterior-dominant for video-evoked emotion, a paradigm-level finding for dynamic stimuli rather than a refutation of the static frontal-alpha asymmetry literature, which we replicate qualitatively in direction at smaller magnitude.

### This is not a refutation of frontal-alpha asymmetry (FAA).

Davidson’s FAA hypothesis (Davidson, [1992](https://arxiv.org/html/2606.00129#bib.bib118); Coan and Allen, [2004](https://arxiv.org/html/2606.00129#bib.bib120)) was developed on static stimuli; the cohort signal we report is on dynamic 28-second video. Right-minus-left frontal-alpha asymmetry remains positive in our data ($+0.006$ to $+0.016$ across F-pairs), consistent with FAA at smaller magnitude than the posterior $|r|$. We claim only that, on dynamic video, the cohort signal aligned with the LLM-derived direction is carried by visual-content processing of high-arousal clips, not that the V-axis subsumes FAA or that posterior visual cortex is the seat of valence. Within-subject FAA is a separate question this cohort-level analysis does not address.

## 8 Two-Tier Ensemble: The Positive Prescription

The mechanism (§[4](https://arxiv.org/html/2606.00129#S4)) predicts that ensembling should recover gain in the within-class residual subspace where auxiliary supervision cannot reach: the saturated basin is shared, the residual is seed-specific. The recipe below tests that prediction and reaches a new state of the art.
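One possible reading of the basin-versus-residual split, offered only as a sketch: treat each feature vector as its class mean plus a within-class residual, and take the first principal direction of the class centroids as the class-level axis. The formal definitions live in §4 and may differ from this; the function names and toy data below are assumptions for illustration.

```python
import numpy as np


def split_class_mean_residual(features: np.ndarray, labels: np.ndarray):
    """Split features into a class-mean part and a within-class residual."""
    class_means = np.zeros_like(features, dtype=float)
    for c in np.unique(labels):
        mask = labels == c
        class_means[mask] = features[mask].mean(axis=0)
    return class_means, features - class_means


def class_pc1(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """First principal direction of the class centroids (unit norm)."""
    cents = np.stack([features[labels == c].mean(axis=0) for c in np.unique(labels)])
    cents = cents - cents.mean(axis=0, keepdims=True)
    return np.linalg.svd(cents, full_matrices=False)[2][0]


rng = np.random.default_rng(0)
labels = np.repeat(np.arange(9), 20)                 # 9 classes, 20 toy samples each
features = rng.normal(size=(180, 16)) + labels[:, None] * 0.5

class_part, residual = split_class_mean_residual(features, labels)
pc1 = class_pc1(features, labels)
```

By construction the residual has zero mean within every class, so it carries no information about the class centroids themselves.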

(a) Headline. $+0.0661$ abs. ($+10.5\%$) over EMOD, the previous FACED-9 SOTA.

(b) Recipe cascade ($^{*}$super-add., $p = 3 \times 10^{-4}$).

Table 2: New FACED 9-class SOTA. The `rand9` ablation replaces LLM-derived KD prototypes with random orthonormal 9-D directions and costs $\leq 0.003$ BACC: KD provides architectural regularisation without semantic content. Visual cascade in Appendix [S13](https://arxiv.org/html/2606.00129#A13).

The cascade starts at the EMOD replication ($0.6194 \pm 0.004$, within seed noise of the published $0.6287$ (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101))) and walks up in six steps. Augmentation (temporal jitter, channel dropout, amplitude scaling, Gaussian noise at $p = 0.6$) contributes the largest single jump ($+0.0149$). Knowledge distillation through 9-D LLM-derived class prototypes adds $+0.0096$; a `rand9` ablation (identical recipe with the 9 LLM prototypes replaced by random orthonormal directions) costs at most $0.003$ BACC, paired $p > 0.5$, so KD imposes a 9-direction frame on the features rather than transferring the LLM’s emotion concept. Depth-doubling to $d = 6$ adds $+0.0142$; longer training ($e = 150$) does not move single-seed accuracy but supplies the diversity axis the ensemble exploits.
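The four augmentations named above could be composed roughly as follows. This is a sketch under explicit assumptions: each transform is applied independently with probability `p`, jitter is a circular shift, and the magnitudes (`max_shift`, `drop_frac`, `scale_range`, `noise_std`) are hypothetical placeholders, since the section gives only $p = 0.6$.

```python
import numpy as np


def augment(x, rng, p=0.6, max_shift=25, drop_frac=0.1,
            scale_range=(0.9, 1.1), noise_std=0.05):
    """Return an augmented copy of an EEG segment x of shape (n_channels, n_samples).

    All magnitudes are illustrative defaults, not values from the paper.
    """
    out = x.copy()
    if rng.random() < p:  # temporal jitter: circular shift along time
        out = np.roll(out, int(rng.integers(-max_shift, max_shift + 1)), axis=1)
    if rng.random() < p:  # channel dropout: zero a random subset of channels
        keep = rng.random(out.shape[0]) >= drop_frac
        out = out * keep[:, None]
    if rng.random() < p:  # amplitude scaling: one global gain factor
        out = out * rng.uniform(*scale_range)
    if rng.random() < p:  # additive Gaussian noise, scaled to the signal's std
        out = out + rng.normal(scale=noise_std * out.std(), size=out.shape)
    return out


rng = np.random.default_rng(0)
segment = rng.normal(size=(32, 250))   # toy segment: 32 channels, 1 s at 250 Hz
augmented = augment(segment, rng)
```

Setting `p=0.0` turns the function into an identity copy, which is a convenient check that augmentation is disabled at evaluation time.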

### Two\-tier ensemble theory\.

The 10 $d = 6$ checkpoints share an identical class-PC1 basin ($|r| \in [0.60, 0.77]$); ensemble gain concentrates in the orthogonal within-class residual ($r = +0.74$, $p = 0.014$). Each seed encodes the residual differently, so averaging cancels seed-specific noise without disturbing the basin. Mixing $e = 100$ and $e = 150$ ties at $0.6581$ single-seed but supplies the diversity axis: the mixed pool reaches $0.6948$ versus $0.6798$ for $e = 100$ alone. Headroom-monotonicity (near-zero on MNIST), the 10-checkpoint plateau, and failure of cross-architecture mixing confirm intra-architecture training-length diversity as the operative mechanism (Appendix [S13](https://arxiv.org/html/2606.00129#A13)).
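The averaging argument can be illustrated on toy logits in which every checkpoint shares the same class signal but adds its own independent noise. The sketch assumes the ensemble averages softmax probabilities before the argmax (the section does not say whether probabilities or logits are averaged) and scores with balanced accuracy as mean per-class recall; names and data are illustrative.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def ensemble_predict(logits: np.ndarray) -> np.ndarray:
    """Average class probabilities over checkpoints, then take the argmax.

    logits: (n_checkpoints, n_samples, n_classes).
    """
    return softmax(logits).mean(axis=0).argmax(axis=1)


def balanced_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean per-class recall over the classes present in y_true."""
    recalls = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(recalls))


rng = np.random.default_rng(0)
n_ckpt, n_classes = 10, 9
y_true = np.repeat(np.arange(n_classes), 100)
shared = np.eye(n_classes)[y_true]                 # signal common to all checkpoints
noise = rng.normal(size=(n_ckpt, len(y_true), n_classes))  # checkpoint-specific part
logits = shared[None] + noise

single = [balanced_accuracy(y_true, ckpt.argmax(axis=1)) for ckpt in logits]
pooled = balanced_accuracy(y_true, ensemble_predict(logits))
print(f"mean single = {np.mean(single):.3f}, ensemble = {pooled:.3f}")
```

In this toy setting the gain comes entirely from cancelling independent noise; if all checkpoints made the same errors, averaging would leave accuracy unchanged, which is the intuition behind seeking diversity across the pool.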

## 9 Discussion: Limitations and Future Work

Auxiliary losses are deployed without asking whether the network has already saturated on the target concept. We take three steps: a saturation regularity organising 25 recipe failures, a residual-diversity ensemble turning the mechanism into FACED-9 SOTA, and a nine-story V-axis probe showing the saturated concept is a real semantic axis shared with language. The regularity is scoped to FACED-9 with a V-axis substrate; probe transfers are direction-existence rather than neuroscience claims; the posterior topography is a video-paradigm statement, not a refutation of FAA (Davidson, [1992](https://arxiv.org/html/2606.00129#bib.bib118)). The cross-modal asymmetry (valence transfers across text, vision, and EEG; arousal transfers only in vision) suggests the saturating concept is content-specific, not recipe-universal (Appendix [S6](https://arxiv.org/html/2606.00129#A6)). The same diagnosis predicts where on the population-vs-individual spectrum further gain lives: the cohort signal is a 9-stimulus emotional-pole contrast (Appendix [S9](https://arxiv.org/html/2606.00129#A9)), but a per-subject channel oracle reaches $\overline{|r|} = 0.616$ versus $0.021$ at cohort-fixed channels. A practical corollary follows for the EEG-emotion community: aux-loss design papers should report the receiving baseline’s saturation state, since the same loss can be within seed noise on a $0.62$ recipe and a significant negative on a $0.66$ recipe (§[3](https://arxiv.org/html/2606.00129#S3), Appendix [S10](https://arxiv.org/html/2606.00129#A10)). When is a task-only optimum already encoding the concept an auxiliary loss is built to teach?
Per-subject residual adaptation (Appendix [S16](https://arxiv.org/html/2606.00129#A16)) is the immediate test; whether the same residual-vs-basin decomposition is the right design rule beyond saturated 9-class EEG, and whether the V-axis recipe applied to other saturated concepts (truthfulness, refusal, sentiment polarity) makes the same prescription portable, is the broader question this paper opens.

## References

- A. Arditi et al. (2024) Refusal in language models is mediated by a single direction. In Advances in Neural Information Processing Systems.
- R. T. Canolty and R. T. Knight (2010) The functional role of cross-frequency coupling. Trends in Cognitive Sciences 14(11), pp. 506–515. External Links: [Document](https://dx.doi.org/10.1016/j.tics.2010.09.001).
- C. Caucheteux and J. King (2022) Brains and algorithms partially converge in natural language processing. Communications Biology 5, pp. 134. External Links: [Document](https://dx.doi.org/10.1038/s42003-022-03036-1).
- J. Chen, X. Wang, C. Huang, et al. (2023) A large finer-grained affective computing EEG dataset. Scientific Data 10, pp. 740. External Links: [Document](https://dx.doi.org/10.1038/s41597-023-02650-w).
- Y. Chen, S. Zhao, S. Li, and G. Pan (2025) EMOD: a unified EEG emotion representation framework leveraging v-a guided contrastive learning. arXiv preprint arXiv:2511.05863. Note: To appear at AAAI 2026.
- J. A. Coan and J. J. B. Allen (2004) Frontal EEG asymmetry as a moderator and mediator of emotion. Biological Psychology 67(1-2), pp. 7–50. External Links: [Document](https://dx.doi.org/10.1016/j.biopsycho.2004.03.002).
- M. Codispoti, A. De Cesarei, and V. Ferrari (2023) Alpha-band oscillations and emotion: a review of studies on picture perception. Psychophysiology 60(11), pp. e14438. External Links: [Document](https://dx.doi.org/10.1111/psyp.14438).
- R. J. Davidson (1992) Anterior cerebral asymmetry and the nature of emotion. Brain and Cognition 20(1), pp. 125–151. External Links: [Document](https://dx.doi.org/10.1016/0278-2626%2892%2990065-T).
- R. J. Davidson (2004) Well-being and affective style: neural substrates and biobehavioural correlates. Philosophical Transactions of the Royal Society B: Biological Sciences 359(1449), pp. 1395–1411. External Links: [Document](https://dx.doi.org/10.1098/rstb.2004.1510).
- Y. El Ouahidi, J. Lys, P. Thölke, N. Farrugia, B. Pasdeloup, V. Gripon, K. Jerbi, and G. Lioi (2025) REVE: a foundation model for EEG — adapting to any setup with large-scale pretraining on 25,000 subjects. In Advances in Neural Information Processing Systems. Note: arXiv:2510.21585.
- J. Frankle, G. K. Dziugaite, D. M. Roy, and M. Carbin (2020) Linear mode connectivity and the lottery ticket hypothesis. In International Conference on Machine Learning. Note: arXiv:1912.05671.
- S. Geman, E. Bienenstock, and R. Doursat (1992) Neural networks and the bias/variance dilemma. Neural Computation 4(1), pp. 1–58. External Links: [Document](https://dx.doi.org/10.1162/neco.1992.4.1.1).
- A. Goldstein, Z. Zada, E. Buchnik, M. Schain, A. Price, B. Aubrey, S. A. Nastase, A. Feder, D. Emanuel, A. Cohen, A. Jansen, H. Gazula, G. Choe, A. Rao, C. Kim, C. Casto, L. Fanda, W. Doyle, D. Friedman, P. Dugan, L. Melloni, R. Reichart, S. Devore, A. Flinker, L. Hasenfratz, O. Levy, A. Hassidim, M. Brenner, Y. Matias, K. A. Norman, O. Devinsky, and U. Hasson (2022) Shared computational principles for language processing in humans and deep language models. Nature Neuroscience 25(3), pp. 369–380. External Links: [Document](https://dx.doi.org/10.1038/s41593-022-01026-4).
- G. Hajcak, A. MacNamara, and D. M. Olvet (2010) Event-related potentials, emotion, and emotion regulation: an integrative review. Developmental Neuropsychology 35(2), pp. 129–155. External Links: [Document](https://dx.doi.org/10.1080/87565640903526504).
- G. Hinton, O. Vinyals, and J. Dean (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
- E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen (2022) LoRA: low-rank adaptation of large language models. In International Conference on Learning Representations.
- M. Huh, B. Cheung, T. Wang, and P. Isola (2024) Position: the platonic representation hypothesis. In International Conference on Machine Learning. Note: arXiv:2405.07987.
- W. Jiang, L. Zhao, and B. Lu (2024) Large brain model for learning generic representations with tremendous EEG data in BCI. In International Conference on Learning Representations. Note: arXiv:2405.18765.
- P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan (2020) Supervised contrastive learning. In Advances in Neural Information Processing Systems.
- B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, and R. Sayres (2018) Interpretability beyond feature attribution: quantitative testing with concept activation vectors (TCAV). In International Conference on Machine Learning. Note: arXiv:1711.11279.
- N. Kriegeskorte, M. Mur, and P. Bandettini (2008) Representational similarity analysis — connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience 2.
- A. Krogh and J. Vedelsby (1995) Neural network ensembles, cross validation, and active learning. In Advances in Neural Information Processing Systems, pp. 231–238.
- B. Kurdi, S. Lozano, and M. R. Banaji (2017) Introducing the open affective standardized image set (OASIS). Behavior Research Methods 49(2), pp. 457–470. External Links: [Document](https://dx.doi.org/10.3758/s13428-016-0715-3).
- K. Li, O. Patel, F. Viégas, H. Pfister, and M. Wattenberg (2023) Inference-time intervention: eliciting truthful answers from a language model. In Advances in Neural Information Processing Systems. Note: arXiv:2306.03341.
- H. Liu, D. Tam, M. Muqeeth, J. Mohta, T. Huang, M. Bansal, and C. Raffel (2022a) Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. In Advances in Neural Information Processing Systems. Note: arXiv:2205.05638.
- W. Liu, J. Qiu, W. Zheng, and B. Lu (2022b) Comparing recognition performance and robustness of multimodal deep learning models for multimodal emotion recognition. IEEE Transactions on Cognitive and Developmental Systems 14(2), pp. 715–729. External Links: [Document](https://dx.doi.org/10.1109/TCDS.2021.3071170).
- Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
- Y. Liu, Z. Jia, and H. Wang (2023) EmotionKD: a cross-modal knowledge distillation framework for emotion recognition based on physiological signals. In Proceedings of the 31st ACM International Conference on Multimedia, pp. 6122–6131. External Links: [Document](https://dx.doi.org/10.1145/3581783.3612277).
- N. Loo, F. Iliopoulos, W. Hu, and E. Vee (2024) LELP: linear projections of teacher embeddings for few-class distillation. arXiv preprint arXiv:2409.20449.
- A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts (2011) Learning word vectors for sentiment analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 142–150.
- N. Muennighoff, T. Wang, L. Sutawika, A. Roberts, S. Biderman, T. Le Scao, M. S. Bari, S. Shen, Z. X. Yong, H. Schoelkopf, X. Tang, D. Radev, A. F. Aji, K. Almubarak, S. Albanie, Z. Alyafeai, A. Webson, E. Raff, and C. Raffel (2023) Crosslingual generalization through multitask finetuning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 15991–16111. Note: arXiv:2211.01786.
- M. M. Müller, A. Keil, T. Gruber, and T. Elbert (1999) Processing of affective pictures modulates right-hemispheric gamma band EEG activity. Clinical Neurophysiology 110(11), pp. 1913–1920. External Links: [Document](https://dx.doi.org/10.1016/S1388-2457%2899%2900151-0).
- R. Müller, S. Kornblith, and G. Hinton (2020) Subclass distillation. arXiv preprint arXiv:2002.03936.
- S. Padakanti, K. Pahwa, R. Mamidi, B. R. Surampudi, M. Gupta, and S. R. Oota (2025) Aligning text/speech representations from multimodal models with MEG brain activity during listening. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. Note: ACL Anthology 2025.emnlp-main.1748.
- B. Pang and L. Lee (2005) Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), pp. 115–124.
- W. Park, D. Kim, Y. Lu, and M. Cho (2019) Relational knowledge distillation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever (2021) Learning transferable visual models from natural language supervision. In International Conference on Machine Learning.
- N. Reimers and I. Gurevych (2019) Sentence-BERT: sentence embeddings using siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing.
- S. Rosenthal, N. Farra, and P. Nakov (2017) SemEval-2017 task 4: sentiment analysis in Twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 502–518.
- M. Schrimpf, I. A. Blank, G. Tuckute, C. Kauf, E. A. Hosseini, N. Kanwisher, J. B. Tenenbaum, and E. Fedorenko (2021) The neural architecture of language: integrative modeling converges on predictive processing. Proceedings of the National Academy of Sciences 118(45), pp. e2105646118. External Links: [Document](https://dx.doi.org/10.1073/pnas.2105646118).
- R. Socher, A. Perelygin, J. Wu, J. Chuang, C. D. Manning, A. Ng, and C. Potts (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1631–1642.
- N. Sofroniew, I. Kauvar, W. Saunders, R. Chen, T. Henighan, S. Hydrie, C. Citro, A. Pearce, J. Tarng, W. Gurnee, J. Batson, S. Zimmerman, K. Rivoire, K. Fish, C. Olah, and J. Lindsey (2026) Emotion concepts and their function in a large language model. Transformer Circuits. External Links: [Link](https://transformer-circuits.pub/2026/emotions/index.html).
- S. Sundaram, S. Fu, L. Muttenthaler, N. Y. Tamir, L. Chai, S. Kornblith, T. Darrell, and P. Isola (2024) When does perceptual alignment benefit vision representations? In Advances in Neural Information Processing Systems. Note: arXiv:2410.10817.
- Y. Tian, D. Krishnan, and P. Isola (2020) Contrastive representation distillation. In International Conference on Learning Representations. Note: arXiv:1910.10699.
- M. Toneva and L. Wehbe (2019) Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). In Advances in Neural Information Processing Systems. Note: arXiv:1905.11833.
- P. Vahidi, O. G. Sani, and M. M. Shanechi (2024) Modeling and dissociation of intrinsic and input-driven neural population dynamics underlying behavior. Proceedings of the National Academy of Sciences 121(7), pp. e2212887121. External Links: [Document](https://dx.doi.org/10.1073/pnas.2212887121).
- A. van den Oord, Y. Li, and O. Vinyals (2018) Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748.
- G. Wang, W. Liu, Y. He, C. Xu, L. Ma, and H. Li (2024) EEGPT: pretrained transformer for universal and reliable representation of EEG signals. In Advances in Neural Information Processing Systems.
- J. Wang, S. Zhao, Z. Luo, Y. Zhou, H. Jiang, S. Li, T. Li, and G. Pan (2025) CBraMod: a criss-cross brain foundation model for EEG decoding. In International Conference on Learning Representations. Note: arXiv:2412.07236.
- M. Wortsman, G. Ilharco, S. Y. Gadre, R. Roelofs, R. Gontijo-Lopes, A. S. Morcos, H. Namkoong, A. Farhadi, Y. Carmon, S. Kornblith, and L. Schmidt (2022) Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. In International Conference on Machine Learning. Note: arXiv:2203.05482.
- L. Yuan, F. E. H. Tay, G. Li, T. Wang, and J. Feng (2020) Revisiting knowledge distillation via label smoothing regularization. In CVPR. Note: arXiv:1909.11723.
- X. Zhang, J. Zhao, and Y. LeCun (2015) Character-level convolutional networks for text classification. In Advances in Neural Information Processing Systems, pp. 649–657.
- A. Zou, L. Phan, S. Chen, J. Campbell, P. Guo, R. Ren, A. Pan, X. Yin, M. Mazeika, A. Dombrowski, S. Goel, N. Li, M. J. Byun, Z. Wang, A. Mallen, S. Basart, S. Koyejo, D. Song, M. Fredrikson, J. Z. Kolter, and D. Hendrycks (2023) Representation engineering: a top-down approach to AI transparency. arXiv preprint arXiv:2310.01405.

## NeurIPS Paper Checklist

1. 1\.Claims
2. Question: Do the main claims made in the abstract and introduction accurately reflect the paper’s contributions and scope?
3. Answer:\[Yes\]
4. Justification: The abstract and introduction state five claims supported by experiments in the paper: (i) a universal valence axis (V-axis) extractable from 14 LLMs; (ii) cohort EEG–LLM correlation of $r = 0.87$ ($p < 10^{-9}$) on FACED, replicated at $r = +0.62$ on SEED-V; (iii) cross-architecture convergence ($r = +0.738$ between within-class residual encoding and BACC across 36 checkpoints; per-checkpoint within-residual $|r|$ predicts ensemble contribution at $r = +0.74$, $p = 0.014$); (iv) a saturation regularity documented across 25 alignment-intervention families; (v) a new FACED 9-class SOTA at $0.6948$ BACC ensemble / $0.6755$ single checkpoint. Each claim is supported in Sections [5](https://arxiv.org/html/2606.00129#S5)–[8](https://arxiv.org/html/2606.00129#S8) with its primary table or figure.
6. 2\.Limitations
7. Question: Does the paper discuss the limitations of the work performed by the authors?
8. Answer:\[Yes\]
9. Justification: Section [3](https://arxiv.org/html/2606.00129#S3) contains a boxed scope statement (“Saturation (FACED-9, EMOD backbone, V-axis target)”) naming the empirical-not-mathematical scope and the FACED-specific BACC threshold $[0.62, 0.66]$. Section [9](https://arxiv.org/html/2606.00129#S9) explicitly states that the regularity is scoped to FACED-9 with a measurable V-axis substrate, with no claim beyond it, and that probe transfers (SST-2, cohort EEG, classifier convergence) are direction-existence rather than neuroscience claims. Posterior-topography is framed as a video-paradigm statement rather than an FAA refutation. The cross-architecture class-PC1 correlation is softened by a 93.5th-percentile null-direction control ($n=1000$, one-sided $p\approx 0.065$) in Section [4](https://arxiv.org/html/2606.00129#S4). A catalogue of negative results (theta-gamma PAC, arousal-axis brain-side failure, Path-B mixing) is in Appendix [S11](https://arxiv.org/html/2606.00129#A11).
10. Guidelines: - •The answer\[N/A\]means that the paper has no limitation while the answer\[No\]means that the paper has limitations, but those are not discussed in the paper\. - •The authors are encouraged to create a separate “Limitations” section in their paper\. - •The paper should point out any strong assumptions and how robust the results are to violations of these assumptions \(e\.g\., independence assumptions, noiseless settings, model well\-specification, asymptotic approximations only holding locally\)\. The authors should reflect on how these assumptions might be violated in practice and what the implications would be\. - •The authors should reflect on the scope of the claims made, e\.g\., if the approach was only tested on a few datasets or with a few runs\. In general, empirical results often depend on implicit assumptions, which should be articulated\. - •The authors should reflect on the factors that influence the performance of the approach\. For example, a facial recognition algorithm may perform poorly when image resolution is low or images are taken in low lighting\. Or a speech\-to\-text system might not be used reliably to provide closed captions for online lectures because it fails to handle technical jargon\. - •The authors should discuss the computational efficiency of the proposed algorithms and how they scale with dataset size\. - •If applicable, the authors should discuss possible limitations of their approach to address problems of privacy and fairness\. - •While the authors might fear that complete honesty about limitations might be used by reviewers as grounds for rejection, a worse outcome might be that reviewers discover limitations that aren’t acknowledged in the paper\. The authors should use their best judgment and recognize that individual actions in favor of transparency play an important role in developing norms that preserve the integrity of the community\. 
Reviewers will be specifically instructed to not penalize honesty concerning limitations\.
11. 3\.Theory assumptions and proofs
12. Question: For each theoretical result, does the paper provide the full set of assumptions and a complete \(and correct\) proof?
13. Answer:\[N/A\]
14. Justification: The paper is empirical. The “saturation regularity” (Section [3](https://arxiv.org/html/2606.00129#S3)) is stated explicitly as an empirical pattern, not a mathematical theorem; we do not claim a proof. The boxed Statement of the Regularity in Section [3](https://arxiv.org/html/2606.00129#S3) writes out what would be the formal claim and the empirical pillars supporting it (16 / 25 statistically significant negatives, 5 monotonic-destruction families, unidirectional saturation transition, direct mechanism check). The Residual Contribution Regularity in Section [4](https://arxiv.org/html/2606.00129#S4) is an empirical regression claim (slope $\hat{\beta}=+0.74$, $p=0.014$, $n=10$), not a theorem.
15. Guidelines: - •The answer\[N/A\]means that the paper does not include theoretical results\. - •All the theorems, formulas, and proofs in the paper should be numbered and cross\-referenced\. - •All assumptions should be clearly stated or referenced in the statement of any theorems\. - •The proofs can either appear in the main paper or the supplemental material, but if they appear in the supplemental material, the authors are encouraged to provide a short proof sketch to provide intuition\. - •Inversely, any informal proof provided in the core of the paper should be complemented by formal proofs provided in appendix or supplemental material\. - •Theorems and Lemmas that the proof relies upon should be properly referenced\.
16. 4\.Experimental result reproducibility
17. Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper \(regardless of whether the code and data are provided or not\)?
18. Answer:\[Yes\]
19. Justification: The V-axis extraction protocol (story prompts, paraphrase generation, hidden-state extraction, PCA-PC1 with Joy-positive orientation) is documented in Section [5](https://arxiv.org/html/2606.00129#S5) and Appendix [S2](https://arxiv.org/html/2606.00129#A2). EEG model training (architecture, optimisation, augmentation, KD, ensemble construction, exact seed list) is in Appendix [S14](https://arxiv.org/html/2606.00129#A14). Statistical methods (bootstrap, random-direction null, paired tests) are in Appendix [S15](https://arxiv.org/html/2606.00129#A15). Dataset splits (subjects 0–79 / 80–99 / 100–122 train/val/test on FACED) are in Appendix [S1](https://arxiv.org/html/2606.00129#A1). The full intervention table with seeds and JSON pointers is in Appendix [S10](https://arxiv.org/html/2606.00129#A10).
20. Guidelines: - •The answer\[N/A\]means that the paper does not include experiments\. - •If the paper includes experiments, a\[No\]answer to this question will not be perceived well by the reviewers: Making the paper reproducible is important, regardless of whether the code and data are provided or not\. - •If the contribution is a dataset and/or model, the authors should describe the steps taken to make their results reproducible or verifiable\. - •Depending on the contribution, reproducibility can be accomplished in various ways\. For example, if the contribution is a novel architecture, describing the architecture fully might suffice, or if the contribution is a specific model and empirical evaluation, it may be necessary to either make it possible for others to replicate the model with the same dataset, or provide access to the model\. In general\. releasing code and data is often one good way to accomplish this, but reproducibility can also be provided via detailed instructions for how to replicate the results, access to a hosted model \(e\.g\., in the case of a large language model\), releasing of a model checkpoint, or other means that are appropriate to the research performed\. - •While NeurIPS does not require releasing code, the conference does require all submissions to provide some reasonable avenue for reproducibility, which may depend on the nature of the contribution\. For example 1. \(a\)If the contribution is primarily a new algorithm, the paper should make it clear how to reproduce that algorithm\. 2. \(b\)If the contribution is primarily a new model architecture, the paper should describe the architecture clearly and fully\. 3. \(c\)If the contribution is a new model \(e\.g\., a large language model\), then there should either be a way to access this model for reproducing the results or a way to reproduce the model \(e\.g\., with an open\-source dataset or instructions for how to construct the dataset\)\. 4. 
\(d\)We recognize that reproducibility may be tricky in some cases, in which case authors are welcome to describe the particular way they provide for reproducibility\. In the case of closed\-source models, it may be that access to the model is limited in some way \(e\.g\., to registered users\), but it should be possible for other researchers to have some path to reproducing or verifying the results\.
21. 5\.Open access to data and code
22. Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material?
23. Answer:\[No\]
24. Justification: We do not release code or model checkpoints with this submission. The two datasets used (FACED (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)) and SEED-V (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99))) are publicly available from their original authors under their published licences; we have not modified or re-released them. The full methodology, hyperparameters, optimisation details, augmentation schedule, exact seed list, and SLURM-array recipe required to reproduce every result in the paper are documented in the appendix (Appendices [S14](https://arxiv.org/html/2606.00129#A14), [S15](https://arxiv.org/html/2606.00129#A15), [S1](https://arxiv.org/html/2606.00129#A1), [S10](https://arxiv.org/html/2606.00129#A10), [S19](https://arxiv.org/html/2606.00129#A19)). Code, configs, and the checkpoints of the 10-checkpoint ensemble will be released upon acceptance.
25. Guidelines: - •The answer\[N/A\]means that paper does not include experiments requiring code\. - • - •While we encourage the release of code and data, we understand that this might not be possible, so\[No\]is an acceptable answer\. Papers cannot be rejected simply for not including code, unless this is central to the contribution \(e\.g\., for a new open\-source benchmark\)\. - •The instructions should contain the exact command and environment needed to run to reproduce the results\. See the NeurIPS code and data submission guidelines \([https://neurips\.cc/public/guides/CodeSubmissionPolicy](https://neurips.cc/public/guides/CodeSubmissionPolicy)\) for more details\. - •The authors should provide instructions on data access and preparation, including how to access the raw data, preprocessed data, intermediate data, and generated data, etc\. - •The authors should provide scripts to reproduce all experimental results for the new proposed method and baselines\. If only a subset of experiments are reproducible, they should state which ones are omitted from the script and why\. - •At submission time, to preserve anonymity, the authors should release anonymized versions \(if applicable\)\. - •Providing as much information as possible in supplemental material \(appended to the paper\) is recommended, but including URLs to data and code is permitted\.
26. 6\.Experimental setting/details
27. Question: Does the paper specify all the training and test details \(e\.g\., data splits, hyperparameters, how they were chosen, type of optimizer\) necessary to understand the results?
28. Answer:\[Yes\]
29. Justification: Optimiser (AdamW, $\mathrm{lr}=10^{-3}$, $\beta_{1}=0.9$, $\beta_{2}=0.999$, weight decay $10^{-2}$), schedule (cosine + 5-epoch warmup), batch size $128$, training length $e\in\{100,150\}$ epochs, depth $d\in\{3,6\}$, filter dimension $f=128$, seeds $\{42,123,456,789,2025\}$, augmentation parameters (Gaussian $\sigma=0.05\cdot\mathrm{std}$, channel dropout $p=0.15$, temporal masking $\leq 5\%$, all at $p=0.6$), KD ($\lambda_{\mathrm{KD}}=0.5$, $T=1.0$, rand9 9-D orthonormal teacher), and ensemble construction (10 checkpoints split as $5\times$ $e=100$ + $5\times$ $e=150$, uniform softmax averaging) are all specified in Appendix [S14](https://arxiv.org/html/2606.00129#A14).
30. Guidelines: - •The answer\[N/A\]means that the paper does not include experiments\. - •The experimental setting should be presented in the core of the paper to a level of detail that is necessary to appreciate the results and make sense of them\. - •The full details can be provided either with the code, in appendix, or as supplemental material\.
31. 7\.Experiment statistical significance
32. Question: Does the paper report error bars suitably and correctly defined or other appropriate information about the statistical significance of the experiments?
33. Answer:\[Yes\]
34. Justification: We report standard deviation across 5 seeds in every recipe-cascade row (Section [8](https://arxiv.org/html/2606.00129#S8)); Pearson $r$ with $p$-values for every cross-architecture and brain correlation (Sections [6](https://arxiv.org/html/2606.00129#S6), [4](https://arxiv.org/html/2606.00129#S4)); paired $t$-tests for V-axis interventions vs. matched-recipe baselines (Section [3](https://arxiv.org/html/2606.00129#S3)); $B=10{,}000$ subject-resampled bootstrap CIs and Fisher-$z$ $p$-values for cohort EEG correlations (Appendix [S15](https://arxiv.org/html/2606.00129#A15)); 200-direction matched-norm random-direction null distributions for the cohort EEG–LLM correlation; a 1000-direction null for the cross-architecture V-axis correlation; and a matched-direction null for the inference-time directional ablation (Section [4](https://arxiv.org/html/2606.00129#S4), mean $z\approx 7.7$ across the 10 SOTA-pool checkpoints).
35. Guidelines: - •The answer\[N/A\]means that the paper does not include experiments\. - •The authors should answer\[Yes\]if the results are accompanied by error bars, confidence intervals, or statistical significance tests, at least for the experiments that support the main claims of the paper\. - •The factors of variability that the error bars are capturing should be clearly stated \(for example, train/test split, initialization, random drawing of some parameter, or overall run with given experimental conditions\)\. - •The method for calculating the error bars should be explained \(closed form formula, call to a library function, bootstrap, etc\.\) - •The assumptions made should be given \(e\.g\., Normally distributed errors\)\. - •It should be clear whether the error bar is the standard deviation or the standard error of the mean\. - •It is OK to report 1\-sigma error bars, but one should state it\. The authors should preferably report a 2\-sigma error bar than state that they have a 96% CI, if the hypothesis of Normality of errors is not verified\. - •For asymmetric distributions, the authors should be careful not to show in tables or figures symmetric error bars that would yield results that are out of range \(e\.g\., negative error rates\)\. - •If error bars are reported in tables or plots, the authors should explain in the text how they were calculated and reference the corresponding figures or tables in the text\.
36. 8\.Experiments compute resources
37. Question: For each experiment, does the paper provide sufficient information on the computer resources \(type of compute workers, memory, time of execution\) needed to reproduce the experiments?
38. Answer:\[Yes\]
39. Justification: Appendix [S20](https://arxiv.org/html/2606.00129#A20) reports approximately $4{,}500$ V100 GPU-hours total, broken down by experiment family: $\sim$600 for the recipe ablation cascade, $\sim$1,200 for the 25 V-axis-supervision interventions, $\sim$800 for the 36-checkpoint cross-architecture analysis, $\sim$1,200 for ensemble construction and val–test rank diagnostics, and $\sim$700 for cross-dataset (SEED-V) experiments. The 10-checkpoint ensemble SOTA result is reproducible in $\sim$10 GPU-hours per seed. V-axis extraction is computationally cheap ($\sim$10 minutes per LM on a single 80 GB GPU). The total includes preliminary and failed experiments not in the paper.
40. Guidelines: - •The answer\[N/A\]means that the paper does not include experiments\. - •The paper should indicate the type of compute workers CPU or GPU, internal cluster, or cloud provider, including relevant memory and storage\. - •The paper should provide the amount of compute required for each of the individual experimental runs as well as estimate the total compute\. - •The paper should disclose whether the full research project required more compute than the experiments reported in the paper \(e\.g\., preliminary or failed experiments that didn’t make it into the paper\)\.
41. 9\.Code of ethics
42. Question: Does the research conducted in the paper conform, in every respect, with the NeurIPS Code of Ethics [https://neurips.cc/public/EthicsGuidelines](https://neurips.cc/public/EthicsGuidelines)?
43. Answer:\[Yes\]
44. Justification: We use only publicly released datasets (FACED (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)), SEED-V (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99))) collected by other groups under their respective IRB-approved protocols. We collect no new human data. We use only publicly released LLMs under their published licences.
45. Guidelines: - •The answer\[N/A\]means that the authors have not reviewed the NeurIPS Code of Ethics\. - •If the authors answer\[No\], they should explain the special circumstances that require a deviation from the Code of Ethics\. - •The authors should make sure to preserve anonymity \(e\.g\., if there is a special consideration due to laws or regulations in their jurisdiction\)\.
46. 10\.Broader impacts
47. Question: Does the paper discuss both potential positive societal impacts and negative societal impacts of the work performed?
48. Answer:\[Yes\]
49. Justification: Improved EEG emotion classifiers have applications in mental-health monitoring and brain–computer interfaces (positive); they also pose risks of affective inference without consent in surveillance, workplace, or advertising contexts (negative). The V-axis extraction protocol applied to EEG could in principle reduce the calibration-data burden for affective inference systems, a feature that cuts both ways. We do not release human EEG data at any stage; the planned camera-ready code release covers training configs, feature extraction, and analysis only. The full broader-impacts statement is in Appendix [S16](https://arxiv.org/html/2606.00129#A16).
50. Guidelines: - •The answer\[N/A\]means that there is no societal impact of the work performed\. - •If the authors answer\[N/A\]or\[No\], they should explain why their work has no societal impact or why the paper does not address societal impact\. - •Examples of negative societal impacts include potential malicious or unintended uses \(e\.g\., disinformation, generating fake profiles, surveillance\), fairness considerations \(e\.g\., deployment of technologies that could make decisions that unfairly impact specific groups\), privacy considerations, and security considerations\. - •The conference expects that many papers will be foundational research and not tied to particular applications, let alone deployments\. However, if there is a direct path to any negative applications, the authors should point it out\. For example, it is legitimate to point out that an improvement in the quality of generative models could be used to generate Deepfakes for disinformation\. On the other hand, it is not needed to point out that a generic algorithm for optimizing neural networks could enable people to train models that generate Deepfakes faster\. - •The authors should consider possible harms that could arise when the technology is being used as intended and functioning correctly, harms that could arise when the technology is being used as intended but gives incorrect results, and harms following from \(intentional or unintentional\) misuse of the technology\. - •If there are negative societal impacts, the authors could also discuss possible mitigation strategies \(e\.g\., gated release of models, providing defenses in addition to attacks, mechanisms for monitoring misuse, mechanisms to monitor how a system learns from feedback over time, improving the efficiency and accessibility of ML\)\.
51. 11\.Safeguards
52. Question: Does the paper describe safeguards that have been put in place for responsible release of data or models that have a high risk for misuse \(e\.g\., pre\-trained language models, image generators, or scraped datasets\)?
53. Answer:\[N/A\]
54. Justification: We release no high-risk artefacts with this submission. The planned camera-ready release (Appendix [S19](https://arxiv.org/html/2606.00129#A19)) covers training configs, feature-side and analysis code, and the trained EEG-emotion checkpoints; it does not introduce abuse vectors beyond what is already public via FACED, SEED-V, and the listed pre-trained LLMs. We release neither human EEG data nor LLM weights at any stage. Dual-use risk assessment and recommended deployment safeguards are in Appendix [S16](https://arxiv.org/html/2606.00129#A16).
55. Guidelines: - •The answer\[N/A\]means that the paper poses no such risks\. - •Released models that have a high risk for misuse or dual\-use should be released with necessary safeguards to allow for controlled use of the model, for example by requiring that users adhere to usage guidelines or restrictions to access the model or implementing safety filters\. - •Datasets that have been scraped from the Internet could pose safety risks\. The authors should describe how they avoided releasing unsafe images\. - •We recognize that providing effective safeguards is challenging, and many papers do not require this, but we encourage authors to take this into account and make a best faith effort\.
56. 12\.Licenses for existing assets
57. Question: Are the creators or original owners of assets \(e\.g\., code, data, models\), used in the paper, properly credited and are the license and terms of use explicitly mentioned and properly respected?
58. Answer:\[Yes\]
59. Justification: FACED is cited (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)) and used under its public release license; SEED-V is cited (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99)). All 14 LLMs (Qwen3 family of six dense sizes from 0.6B to 32B, Llama-4-Scout-17B, Mistral-7B, Phi-2-2.7B, Pythia-1.4B, BLOOM-560M, TinyLlama-1.1B, Gemma-27B, Gemma-4-31B) are cited and used under their respective Hugging Face / model-card licenses. The CBraMod (Wang et al., [2025](https://arxiv.org/html/2606.00129#bib.bib3)) and EMOD (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101)) backbones are cited and used under their public licenses. CLIP (Radford et al., [2021](https://arxiv.org/html/2606.00129#bib.bib94)) is cited.
60. Guidelines: - •The answer\[N/A\]means that the paper does not use existing assets\. - •The authors should cite the original paper that produced the code package or dataset\. - •The authors should state which version of the asset is used and, if possible, include a URL\. - •The name of the license \(e\.g\., CC\-BY 4\.0\) should be included for each asset\. - •For scraped data from a particular source \(e\.g\., website\), the copyright and terms of service of that source should be provided\. - •If assets are released, the license, copyright information, and terms of use in the package should be provided\. For popular datasets,[paperswithcode\.com/datasets](https://arxiv.org/html/2606.00129v1/paperswithcode.com/datasets)has curated licenses for some datasets\. Their licensing guide can help determine the license of a dataset\. - •For existing datasets that are re\-packaged, both the original license and the license of the derived asset \(if it has changed\) should be provided\. - •If this information is not available online, the authors are encouraged to reach out to the asset’s creators\.
61. 13\.New assets
62. Question: Are new assets introduced in the paper well documented and is the documentation provided alongside the assets?
63. Answer:\[No\]
64. Justification: We do not release new assets with this submission. The V-axis extraction scripts, V-axis-aligned EEG checkpoints, and figure-generation pipelines will be released upon acceptance with documentation. Within the paper, the V-axis itself (nine FACED-class centroids and the PCA-PC1 direction at the LM’s penultimate layer) is fully specified by the protocol in Appendix [S2](https://arxiv.org/html/2606.00129#A2) (model identities, layer indices, story prompts, paraphrase parameters), so any reader can re-derive the V-axis without our checkpoint files.
65. Guidelines: - •The answer\[N/A\]means that the paper does not release new assets\. - •Researchers should communicate the details of the dataset/code/model as part of their submissions via structured templates\. This includes details about training, license, limitations, etc\. - •The paper should discuss whether and how consent was obtained from people whose asset is used\. - •At submission time, remember to anonymize your assets \(if applicable\)\. You can either create an anonymized URL or include an anonymized zip file\.
66. 14\.Crowdsourcing and research with human subjects
67. Question: For crowdsourcing experiments and research with human subjects, does the paper include the full text of instructions given to participants and screenshots, if applicable, as well as details about compensation \(if any\)?
68. Answer:\[N/A\]
69. Justification: We do not collect new human data; all human EEG comes from the publicly released FACED (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)) and SEED-V (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99)) datasets, originally collected by their respective groups under their own IRB-approved protocols.
70. Guidelines: - •The answer\[N/A\]means that the paper does not involve crowdsourcing nor research with human subjects\. - •Including this information in the supplemental material is fine, but if the main contribution of the paper involves human subjects, then as much detail as possible should be included in the main paper\. - •According to the NeurIPS Code of Ethics, workers involved in data collection, curation, or other labor should be paid at least the minimum wage in the country of the data collector\.
71. 15\.Institutional review board \(IRB\) approvals or equivalent for research with human subjects
72. Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board \(IRB\) approvals \(or an equivalent approval/review based on the requirements of your country or institution\) were obtained?
73. Answer:\[N/A\]
74. Justification: No new human-subjects research was conducted. The FACED (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)) and SEED-V (Liu et al., [2022b](https://arxiv.org/html/2606.00129#bib.bib99)) datasets were collected under their original IRB approvals, which we cite.
75. Guidelines: - •The answer\[N/A\]means that the paper does not involve crowdsourcing nor research with human subjects\. - •Depending on the country in which research is conducted, IRB approval \(or equivalent\) may be required for any human subjects research\. If you obtained IRB approval, you should clearly state this in the paper\. - •We recognize that the procedures for this may vary significantly between institutions and locations, and we expect authors to adhere to the NeurIPS Code of Ethics and the guidelines for their institution\. - •For initial submissions, do not include any information that would break anonymity \(if applicable\), such as the institution conducting the review\.
76. 16\.Declaration of LLM usage
77. Question: Does the paper describe the usage of LLMs if it is an important, original, or non\-standard component of the core methods in this research? Note that if the LLM is used only for writing, editing, or formatting purposes and does*not*impact the core methodology, scientific rigor, or originality of the research, declaration is not required\.
78. Answer:\[Yes\]
79. Justification: LLMs are central to the V-axis extraction protocol. The 14 language models from which we extract the V-axis are named in Section [5](https://arxiv.org/html/2606.00129#S5) and Appendix [S2](https://arxiv.org/html/2606.00129#A2): Qwen3 at all six dense sizes (lead model Qwen3-4B; also 0.6B/1.7B/8B/14B/32B), Llama-4-Scout-17B, Mistral-7B, Phi-2-2.7B, Pythia-1.4B, BLOOM-560M, TinyLlama-1.1B, Gemma-27B, Gemma-4-31B. The penultimate-layer hidden state ($L=L_{\max}-1$, e.g., $L=35$ of $36$ for Qwen3-4B) is the source of all class centroids. We also use a generic instruction-tuned Qwen LM (temperature 0.7) as a paraphrase generator in pre-processing; the prompt is in Appendix [S2](https://arxiv.org/html/2606.00129#A2). The CLIP image encoder (Radford et al., [2021](https://arxiv.org/html/2606.00129#bib.bib94)) is used for the cross-modal vision V-axis.
80. Guidelines: - •The answer\[N/A\]means that the core method development in this research does not involve LLMs as any important, original, or non\-standard components\. - •Please refer to our LLM policy in the NeurIPS handbook for what should or should not be described\.

## Supplementary Material

## Appendix S1 Datasets, Splits, and Preprocessing

### FACED\.

123 subjects, 28 emotional video clips covering 9 emotion classes. The test split (subjects 100–122) yields $23\times 28=644$ *trial* pairs; trials are further sub-segmented into $3$-second windows (3 windows per stimulus, plus boundary windows), producing $1{,}932$ test instances used in the confusion matrices and the cross-architecture convergence analysis. The cohort EEG ridge of §[6](https://arxiv.org/html/2606.00129#S6) aggregates back to the $28$-stimulus level. Details of the original FACED data and preprocessing: 9 emotion categories (Anger, Disgust, Fear, Sadness, Neutral, Amusement, Inspiration, Joy, Tenderness) at 3–4 stimuli per class. EEG: 32 channels, 250 Hz sampling, 30-second clip duration. We use the official preprocessing protocol distributed with the dataset release (Chen et al., [2023](https://arxiv.org/html/2606.00129#bib.bib5)): bandpass 0.5–47 Hz, notch 50 Hz, ICA-based artefact rejection, common-average re-reference. Train / validation / test split: subjects 0–79 / 80–99 / 100–122 (no subject overlap), matching the protocol used by CBraMod (Wang et al., [2025](https://arxiv.org/html/2606.00129#bib.bib3)) and EMOD (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101)) for fair comparison.
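The split sizes and test-instance counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' data loader; the constant `WINDOWS_PER_TRIAL = 3` is an assumption chosen because it reproduces the stated count of 1,932, and all variable names are hypothetical.

```python
# Subject-disjoint FACED split as described above (0-indexed subject IDs).
N_SUBJECTS = 123
N_STIMULI = 28
WINDOWS_PER_TRIAL = 3  # assumption: 3 windows per trial reproduces the stated 1,932

train_subjects = list(range(0, 80))    # subjects 0-79
val_subjects = list(range(80, 100))    # subjects 80-99
test_subjects = list(range(100, 123))  # subjects 100-122

# The three splits must be disjoint and together cover every subject.
all_subjects = train_subjects + val_subjects + test_subjects
assert len(set(all_subjects)) == len(all_subjects) == N_SUBJECTS

test_trials = len(test_subjects) * N_STIMULI     # (subject, stimulus) pairs
test_windows = test_trials * WINDOWS_PER_TRIAL   # windowed test instances
print(test_trials, test_windows)
```

Splitting by subject ID rather than by trial is what keeps the evaluation subject-disjoint: no window from a test subject can leak into training.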

### SEED\-V\.

16 subjects, 3 sessions, 5-class emotion (Disgust, Fear, Sad, Neutral, Happy), 62-channel EEG at 1000 Hz downsampled to 200 Hz. Standard SEED-V split with subject-disjoint train / test. We replicate the SEED-V CBraMod recipe at $d=6$ for the cross-architecture generality and five-claim re-derivation reported in Appendix [S8](https://arxiv.org/html/2606.00129#A8).

### DE features\.

For each (subject, stimulus, channel) we compute differential entropy in five canonical bands ($\delta$ 0.5–4, $\theta$ 4–8, $\alpha$ 8–13, $\beta$ 13–30, $\gamma$ 30–47 Hz) per second over the 30-second clip, then take the mean over time to obtain a $32\times 5$ feature matrix per (subject, stimulus) pair. Cohort-level analyses average across the 123 subjects to give the $28\times 32\times 5$ tensor used in Section [6](https://arxiv.org/html/2606.00129#S6).
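The feature computation above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: it uses the common Gaussian closed form for differential entropy, $\tfrac{1}{2}\ln(2\pi e\,\sigma^2)$, and estimates the band-limited variance $\sigma^2$ from the FFT of each 1-second segment. The band-variance estimator, the half-open band edges, and the function name `de_features` are assumptions not specified in the text.

```python
import numpy as np

FS = 250  # Hz, FACED sampling rate stated above
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 47)}

def de_features(eeg, fs=FS):
    """eeg: (n_channels, n_samples) array for one (subject, stimulus) clip.
    Returns an (n_channels, 5) matrix of DE values averaged over 1-second segments."""
    n_ch, n_samp = eeg.shape
    n_sec = n_samp // fs
    segs = eeg[:, : n_sec * fs].reshape(n_ch, n_sec, fs)  # (channel, second, sample)
    spec = np.fft.rfft(segs, axis=-1)
    freqs = np.fft.rfftfreq(fs, d=1.0 / fs)               # 1 Hz resolution
    power = np.abs(spec) ** 2 / fs**2                      # per-bin variance share
    out = np.empty((n_ch, len(BANDS)))
    for j, (lo, hi) in enumerate(BANDS.values()):
        mask = (freqs >= lo) & (freqs < hi)                # assumption: half-open bands
        var = 2.0 * power[..., mask].sum(axis=-1)          # factor 2: one-sided spectrum
        de = 0.5 * np.log(2 * np.pi * np.e * (var + 1e-12))  # Gaussian DE per second
        out[:, j] = de.mean(axis=1)                        # mean over time
    return out
```

Applied to a `(32, 7500)` clip this returns the $32\times 5$ matrix described above; stacking the outputs over stimuli and averaging over subjects would give the cohort-level $28\times 32\times 5$ tensor.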

## Appendix S2 V-Axis Extraction Protocol

### Story prompts\.

Nine emotion\-evocative stories \(one per FACED class\), each 1–3 sentences\. Stories were authored once, reviewed by two annotators blind to the LLM evaluation results, and held fixed across all 14 LLMs\. Verbatim text:

- Anger. A driver cuts you off in heavy traffic and laughs through the window. Your hands tighten on the wheel and you feel heat rise in your chest.
- Disgust. You open the fridge and find a container of leftovers covered in green fuzz, with a sour smell that makes you step back.
- Fear. Walking home alone at night, you hear footsteps quicken behind you and notice the streetlights are out.
- Sadness. You sort through old photographs of a friend who died last year and realise you can no longer remember the sound of their voice.
- Neutral. You wait in line at the post office, glancing at the clock and watching the queue slowly advance one customer at a time.
- Amusement. A puppy sneezes so hard it tumbles backwards and stares at the floor in confusion before sneezing again.
- Inspiration. A dancer with one leg performs a flawless routine to a standing ovation, demonstrating that constraint can become craft.
- Joy. You receive an unexpected acceptance letter from a programme you applied to months ago and forgot about.
- Tenderness. A grandparent gently brushes a sleeping child’s hair away from their forehead, careful not to wake them.

Image versions used for the CLIP\-image V\-axis \(Appendix[S5](https://arxiv.org/html/2606.00129#A5)\) are nine standard affective stimuli matched to the same nine class labels \(full URLs and licences in the released code\)\.

### Paraphrase generation\.

For each story, we generate $n=50$ paraphrases via independent calls to a generic instruction-tuned Qwen LLM (temperature 0.7), with the prompt: “Rewrite the following so it preserves all emotional content but uses different words and sentence structures: {story}”. Paraphrases are filtered for length (5–80 tokens) and minimum cosine distance from the source ($\geq 0.10$ on Sentence-BERT embeddings).

### Hidden\-state extraction\.

Each paraphrase is tokenised and passed through the target LM. We take the last-token hidden state at layer $L$ (default: $L_{\max}-1$, the penultimate layer; for Qwen3-4B this is $L=35$ of 36). Class centroids are $c_k = \frac{1}{50}\sum_{i=1}^{50} h_{k,i}^{(L)}$ for $k=1,\dots,9$.

### V\-axis as PCA\-PC1\.

Stack the nine centroids into a $9 \times D$ matrix and run mean-centred PCA. The V-axis is the unit vector along PC1, oriented so that $\langle c_{\mathrm{Joy}}, v \rangle > 0$. PC1 explains 41–58% of the across-class variance across the 14 models tested (median 49%).
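
The centroid and PCA steps above can be sketched in a few lines of NumPy. This is an illustrative sketch on synthetic hidden states, not the released extraction code: the function name `v_axis`, the planted direction `u`, and the toy dimensions are assumptions, and PCA is computed via an SVD of the mean-centred centroid matrix.

```python
import numpy as np

def v_axis(hidden, joy_index):
    """hidden: (n_classes, n_paraphrases, D) last-token states -> (unit V-axis, PC1 variance share)."""
    centroids = hidden.mean(axis=1)                          # c_k, one centroid per class
    centred = centroids - centroids.mean(axis=0)             # mean-centre across classes
    _, s, vt = np.linalg.svd(centred, full_matrices=False)   # PCA via SVD
    v = vt[0]                                                # PC1, unit norm by construction
    if centroids[joy_index] @ v < 0:                         # orient so <c_Joy, v> > 0
        v = -v
    explained = s[0] ** 2 / np.sum(s ** 2)                   # share of across-class variance
    return v, explained

# Synthetic stand-in for LM hidden states: 9 classes, 50 paraphrases, D = 64.
rng = np.random.default_rng(0)
u = np.zeros(64)
u[0] = 1.0                                                   # planted "valence" direction (toy)
valence = np.linspace(-1.0, 1.0, 9)                          # toy class valences
hidden = 3.0 * valence[:, None, None] * u + 0.1 * rng.standard_normal((9, 50, 64))

v, explained = v_axis(hidden, joy_index=7)                   # Joy is the 8th class in the list above
```

On real models the variance share returned as `explained` corresponds to the 41–58% figure quoted above; on this toy data it is close to 1 because only one direction was planted.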

### Robustness\.

Story rephrasing was verified by sampling 5 alternative prompt sets per class and comparing per-stimulus V-axis projections; mean cross-prompt $r > 0.93$ across 14 models. Paraphrase count was verified at $n \in \{1, 5, 25, 50\}$; SST-2 zero-shot AUC is $0.74$ at $n=1$, rising to the canonical $0.832$ at $n=50$ for Qwen3-4B.

### Few\-shot data\-efficiency curve\.

The full $n$-shot sweep on Qwen3-4B is in Table [3](https://arxiv.org/html/2606.00129#A2.T3). The V-axis essentially matches the supervised logistic regression at $n=15$ paraphrases per class ($\mathrm{AUC}=0.831$ vs supervised $0.837$ at $5{,}000$ labels), a $\sim 30\times$ data-efficiency gap when measured in training-instance count ($15 \times 9 = 135$ generated paraphrases vs $5{,}000$ labelled SST-2 examples). At $n=1$ (single sentence per class, $9$ total), AUC is already $0.740$.

Table 3: V-axis few-shot data-efficiency curve. Standard deviations (5 seeds): $\pm 0.043, 0.014, 0.046, 0.029, 0.010, 0.008, 0.000$ from $n=1$ to $n=50$. Crossover with the $5{,}000$-label supervised logistic regression ($\mathrm{AUC}=0.837$) occurs at $n=15$.

### Supervised-LR equivalence anchor.

A complementary anchor: on the same SST-2 features at the lead-model penultimate layer, a supervised logistic regression reaches AUC $0.680$ at $N=10$ labels, $0.878$ at $N=100$, and $0.951$ at the full $N=5{,}000$ training set. The V-axis at $n=50$ paraphrases ($0.832$) lies between $N=80$ and $N=100$ supervised labels in this sweep, consistent with the data-efficiency claim.

## Appendix S3 Per-LLM Cross-Architecture Convergence: Deep Dive

### Sentiment benchmark table\.

Table [4](https://arxiv.org/html/2606.00129#A3.T4) consolidates the zero-shot V-axis projection scores across five sentiment corpora and contrasts them against (i) a same-corpus 5k-example supervised logistic-regression upper bound on SST-2 and (ii) two zero-shot reference baselines (SBERT prototype-cosine and a $355$M RoBERTa-large-MNLI entailment classifier). The takeaway is that nine LLM stories sit above the generic-encoder baseline and within $0.08$ of an NLI-supervised model that is two orders of magnitude larger in supervised exposure.

Table 4: Zero-shot V-axis sentiment results. SST-2 ($0.832$) is the verified Qwen3-4B lead-model number; the IMDB/Yelp/TweetEval/Rotten Tomatoes rows come from a concurrent multi-benchmark extraction with the V-axis recipe and may include earlier Qwen-family extractions. SST-2 alone is within $0.005$ of a 5k-sample supervised LR baseline ($0.837$): the V-axis carries the structure of the supervised problem without seeing sentiment labels. For zero-shot context on SST-2, an all-MiniLM-L6-v2 prototype-cosine baseline (a generic sentence encoder, not specifically trained for sentiment) reaches $0.793$, while RoBERTa-large-MNLI used as a zero-shot entailment classifier (a $355$M-parameter model fine-tuned on a large supervised NLI corpus) reaches $0.912$. Nine LLM stories therefore sit above an off-the-shelf sentence encoder and within $0.08$ of a 355M-param NLI-supervised model on the same zero-shot evaluation.

### LLM-as-judge baseline (steelman).

A natural reviewer question is whether prompting the same LLM directly as a sentiment classifier (“LLM-as-judge”) would beat the V-axis projection. We test this with Qwen3.5-2B as the judge on SST-2: a minimal binary prompt reaches AUC $0.951$ at $165$ s per $1$k examples, and a 5-shot in-context-learning variant reaches $0.957$ at $253$ s per $1$k. The V-axis projection trails on raw quality ($0.832$) but is $\sim 4\times$ faster ($\leq 50$ s per $1$k on the same hardware) and exposes a single re-usable direction in feature space (composable with downstream losses, ablatable, transferable across modalities; §[6](https://arxiv.org/html/2606.00129#S6), §[4](https://arxiv.org/html/2606.00129#S4)). The competitive advantage is not raw zero-shot accuracy; it is interpretability and re-use, which the LLM-as-judge baseline does not provide.

### Three regimes against behavioural valence\.

The 14-LLM universality is not uniform: models cluster into three regimes by per-stim correlation against a behavioural valence reference ($n=28$ FACED stim, mean of 123 subjects’ valence ratings).

- Top tier (per-stim $r > 0.85$, $n=7$): all six Qwen3 sizes (Qwen3-14B $r=0.944$, Qwen3-4B $0.944$, Qwen3-8B $0.941$, Qwen3-32B $0.937$, Qwen3-0.6B $0.933$, Qwen3-1.7B $0.930$) plus Mistral-7B ($0.935$). This is the *modern-LLM manifold*; within Qwen3 it is exceptionally tight (mean $r_{\mathrm{behav}} = 0.938 \pm 0.005$, mean $r_{\mathrm{eeg}} = 0.796 \pm 0.011$).
- Middle tier ($0.6 \leq r \leq 0.85$, $n=4$): Llama-4-Scout-17B ($0.826$), Gemma-27B ($0.807$), Gemma-4-31B ($0.715$), Phi-2-2.7B ($0.651$).
- Out-of-manifold ($r < 0.4$, $n=3$): Pythia-1.4B ($0.276$), TinyLlama-1.1B ($0.128$), BLOOM-560M ($0.031$).

The within-Qwen3 scaling shows no statistically significant trend ($r=+0.568$ between $\log(\mathrm{hidden\,dim})$ and per-stim $r_{\mathrm{behav}}$, $p=0.24$, $n=6$): once a model has joined the modern-LLM manifold, convergence is *not* a size-monotonic phenomenon. The Qwen3 family $r_{\mathrm{eeg}}$ range is $[+0.781, +0.812]$ with standard deviation $\pm 0.011$, and the behaviourally best-aligned models are mid-size Qwen3 (4B–14B).

### Within\-family vs cross\-family decomposition\.

The “Universal = Qwen3-only” reading is the obvious reviewer pushback, so we decompose the $14$-LLM agreement matrix explicitly. Within Qwen3 (six sizes, $C(6,2)=15$ pairs), V-axis cosine is $\overline{r} = +0.995 \pm 0.003$, essentially perfect alignment. Across families (Qwen3 $\times$ non-Qwen, $48$ pairs), $\overline{r} = +0.585 \pm 0.289$; the wider spread is dominated by the three out-of-manifold models. The Qwen3 $\times$ Mistral subset ($6$ pairs) gives $\overline{r} = +0.939$, and a full Procrustes alignment over the four top-tier non-Qwen models reaches mean pairwise cosine $0.961$: functional V-axis agreement is near-perfect once the modern-LLM manifold is reached, regardless of training family.
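
The pair counts in this decomposition ($15$ within-family, $6 \times 8 = 48$ cross-family) follow directly from how the agreement matrix is split. The sketch below illustrates that split on synthetic data. It assumes agreement is measured as the Pearson correlation between per-stimulus projections of each model, which is one plausible reading since different models have different hidden sizes; the function name `decompose_agreement` is hypothetical.

```python
import numpy as np

def decompose_agreement(proj, is_family):
    """proj: (n_models, n_stimuli) projections; is_family: boolean mask of family members."""
    r = np.corrcoef(proj)                     # (n_models, n_models) agreement matrix
    fam = np.flatnonzero(is_family)
    other = np.flatnonzero(~is_family)
    iu = np.triu_indices(len(fam), k=1)       # unordered within-family pairs
    within = r[np.ix_(fam, fam)][iu]
    cross = r[np.ix_(fam, other)].ravel()     # every family x non-family pair
    return within, cross

# Toy data: 6 "family" models share a signal, 8 other models are unrelated noise.
rng = np.random.default_rng(0)
shared = rng.standard_normal(28)
is_family = np.array([True] * 6 + [False] * 8)
proj = rng.standard_normal((14, 28))
proj[:6] = shared + 0.1 * rng.standard_normal((6, 28))

within, cross = decompose_agreement(proj, is_family)
print(len(within), len(cross))   # 15 48
```

Reporting `within.mean()` and `cross.mean()` with their standard deviations gives summaries of the same form as the $\overline{r}$ values quoted above.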

### V\-shape across architectures\.

The same V-shape emerges across five LM families (Qwen, Pythia, BLOOM, Llama-4-Scout, Phi-2), and across bidirectional encoders BERT, RoBERTa, and DeBERTa, with the peak at layer $L_{\max}-1$ in each case. The cosine between input-layer V-axis and final-layer V-axis is only $0.291$, with mid-layer L14 at $0.067$. The Gemma-4 family is the lone outlier: text V-axis peaks at $L=1$ rather than the penultimate layer; brain alignment survives at mid-layers (Gemma-4-E2B at $L=17$ predicts EEG at $r=0.819$; Gemma-4-E4B at $L=21$ at $r=0.714$).

## Appendix S4 Concept Library and Compositionality

### Compositional axis arithmetic\.

Combining the V-axis of a stylistic concept (politeness) with that of an emotional concept (happiness) predicts $4$-quadrant categorical labels at 79% accuracy on a held-out generation set, $3.16\times$ above chance, with uniform per-cell precision in the $[0.68, 0.86]$ range. Subtractive orthogonalisation cleanly separates the two: happy-AUC rises from $0.912$ to $0.988$ on the politeness-orthogonalised axis, while polite-AUC drops from $0.793$ to $0.560$. Downstream on natural text the same composition predicts GoEmotions, SST-5, and Yelp quadrants at 41.8%–46.7% (documented scope refinement).
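
Subtractive orthogonalisation and quadrant assignment can be sketched as below. This is a minimal illustration under the assumption that "subtractive orthogonalisation" means removing the projection of one axis onto the other (a single Gram-Schmidt step) and that quadrants are read off from the signs of the two projections; the helper names and the toy axes are hypothetical.

```python
import numpy as np

def orthogonalise(v, u):
    """Remove from v its component along u, then renormalise to unit length."""
    u = u / np.linalg.norm(u)
    w = v - (v @ u) * u
    return w / np.linalg.norm(w)

def quadrant(features, v_a, v_b):
    """Label each row 0..3 from the signs of its projections on two axes."""
    return 2 * (features @ v_a > 0).astype(int) + (features @ v_b > 0).astype(int)

rng = np.random.default_rng(0)
v_polite = rng.standard_normal(64)
v_happy = 0.5 * v_polite + rng.standard_normal(64)    # deliberately entangled toy axes
v_happy_clean = orthogonalise(v_happy, v_polite)      # politeness-orthogonalised happy axis

feats = rng.standard_normal((100, 64))                # stand-in for hidden states
q = quadrant(feats, v_happy_clean, v_polite)          # 4-quadrant labels
```

After this step the cleaned axis has zero overlap with the politeness axis, which is the mechanism that would let happy-AUC rise while polite-AUC on that axis falls toward chance.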

### Recipe generality across 20 concepts\.

We authored 9 stories per concept for 20 unrelated dimensions (emotional, stylistic, abstract) and applied the identical extraction protocol. 20 of 20 concepts reach binary AUC $> 0.65$ on held-out test stories, 17 of 20 reach $\geq 0.95$, and 11 are perfect at $1.00$ (Figure [5](https://arxiv.org/html/2606.00129#A4.F5)).

![Refer to caption](https://arxiv.org/html/2606.00129v1/x5.png)

Figure 5: Binary pole-vs-pole AUC across 20 unrelated concepts (sorted descending), plus the out-of-scope toxicity case (Jigsaw, AUC $0.59$).

### Cross-concept transfer matrix.

The $20 \times 20$ matrix has self-AUC mean $0.97$ on the diagonal and off-diagonal mean $0.83$ (gap $+0.13$): the recipe extracts *concept-specific* directions rather than a single global affect axis (Figure [6](https://arxiv.org/html/2606.00129#A4.F6)). Most general source axes: sarcasm ($0.89$), fear ($0.88$), envy and shame ($0.87$). Most specific: decisiveness ($0.78$), complexity ($0.78$). Most captured: urgency ($0.97$), fear ($0.95$), concreteness ($0.94$). Most concept-specific: specificity ($0.65$), sarcasm ($0.67$).
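
A transfer matrix of this kind can be sketched as follows. The code assumes that "best-orientation AUC" (the term used in the Figure 6 caption) means $\max(\mathrm{AUC}, 1-\mathrm{AUC})$, so that an axis is not penalised for an arbitrary sign; the rank-based `auc` helper assumes untied scores, and the three toy concepts stand in for the 20 real ones.

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC for binary 0/1 labels (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def transfer_matrix(axes, feats, labels):
    """Entry (i, j): best-orientation AUC of source axis i on target benchmark j."""
    k = len(axes)
    m = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            a = auc(feats[j] @ axes[i], labels[j])
            m[i, j] = max(a, 1.0 - a)
    return m

# Toy setup: 3 concepts, each separable along its own coordinate axis.
rng = np.random.default_rng(0)
k, d, n = 3, 16, 200
axes = np.eye(d)[:k]
labels = [np.repeat([0, 1], n // 2) for _ in range(k)]
feats = [rng.standard_normal((n, d)) + 2.0 * np.outer(2 * y - 1, axes[j])
         for j, y in enumerate(labels)]

m = transfer_matrix(axes, feats, labels)
diag_mean = np.diag(m).mean()
off_mean = m[~np.eye(k, dtype=bool)].mean()
```

Because of the best-orientation convention, every entry is at least $0.5$, so the off-diagonal mean should be read against that floor rather than against zero.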

![Refer to caption](https://arxiv.org/html/2606.00129v1/x6.png)

Figure 6: Cross-concept transfer matrix (best-orientation AUC). Diagonal: self-classification (white-bordered cells). Off-diagonal: source-axis-applied-to-target-benchmark AUC. Sorted top-to-bottom by mean off-diagonal. Mean diagonal $0.97$, mean off-diagonal $0.83$, gap $+0.13$.

### Toxicity: out-of-scope.

Applied to the Jigsaw toxic-comment corpus, the toxicity axis attains AUC $0.59$. Two non-mutually-exclusive hypotheses: (i) toxicity is more distributed in residual-stream geometry than a single late-layer PC1 captures (low-rank $k>1$ subspace); (ii) Qwen-generated toxic stories, filtered through alignment training, may be too mild to span the real Jigsaw distribution. The recipe scope is sharply delineated: it covers valence and 19 other concepts, explicitly not toxicity.

## Appendix S5 Multilingual and Vision Generality

### Multilingual transfer\.

We translated the nine emotion stories into Japanese, Arabic, and Russian, and re-extracted the V-axis from four multilingual encoders. Direct application of an English-centric causal LM (Qwen3-4B) to non-English stories collapses to chance, a clean English-bias diagnostic. Switching to mT0-base (Muennighoff et al., [2023](https://arxiv.org/html/2606.00129#bib.bib117)) (277M-param multilingual encoder–decoder) fully recovers the SST-2 result and *exceeds* the English baseline in every language tested (Figure [7](https://arxiv.org/html/2606.00129#A5.F7)).

![Refer to caption](https://arxiv.org/html/2606.00129v1/x7.png)

Figure 7: Cross-lingual V-axis recovery. Direct causal-Qwen transfer collapses (English-bias); mT0-base recovers and exceeds the English ceiling at $5\times$ smaller parameter count across three typologically distant languages.

Per-language V-axis cosines (computed on Spanish/French extractions using the same protocol) are moderate ($r_{\mathrm{en\text{--}es}} = 0.47$, $r_{\mathrm{en\text{--}fr}} = 0.16$, $r_{\mathrm{es\text{--}fr}} = 0.46$): different languages share part of the direction but not all of it, consistent with concept-level rather than token-level encoding.

### Cross\-modal generality \(CLIP vision\)\.

The same protocol applied to nine emotion-evocative *images* via CLIP (Radford et al., [2021](https://arxiv.org/html/2606.00129#bib.bib94)) yields valence and arousal axes that rank the OASIS image dataset (Kurdi et al., [2017](https://arxiv.org/html/2606.00129#bib.bib116)) at Pearson $r=0.869$ on valence and $r=0.803$ on arousal. Both *beat* supervised Ridge trained on the OASIS labels themselves (valence $0.836$, arousal $0.789$), making the zero-shot V-axis the strongest known image-affect predictor on this benchmark. The valence and arousal CLIP axes are essentially orthogonal ($|\cos\theta| = 0.013$).

## Appendix S6 Specificity Controls (Nonce, Random, Arousal)

### Nonce\-word ablation\.

Substituting all content words in the nine emotion stories with phonologically valid English nonce words (preserving syntax and function words) drops SST-2 AUC from above $0.83$ to within about two hundredths of chance ($\approx 0.52$). This is the single most decisive control: had the V-axis been a syntactic or prompt-template artefact, the nonce variant would have retained template-level structure and produced a non-trivial axis. The signal is encoded in semantic content.

### Random\-direction baseline\.

Random Gaussian directions in the LM’s residual stream, $L_2$-matched to the V-axis, give SST-2 AUC $\approx 0.51$ and lexicon $|r| \approx 0.10$ at every benchmark. Bootstrap 95% CIs are tight: SST-2 AUC $[0.844, 0.890]$, EmoBank V $|r| \in [0.485, 0.512]$, Hu & Liu $|r| \in [0.740, 0.776]$ (1,000-sample bootstrap; statistical methods in Appendix [S15](https://arxiv.org/html/2606.00129#A15)).
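
A random-direction null of this kind can be sketched as below. The sketch makes several assumptions that are not stated in the text: the score is $|r|$ against a continuous target, 200 directions are drawn (the count used for the EEG nulls later in the appendix), and the empirical $p$-value uses the add-one convention. The function name and the synthetic features are hypothetical.

```python
import numpy as np

def random_direction_null(feats, target, v, n_dirs=200, seed=0):
    """Compare |r| of the projection on v against norm-matched Gaussian directions."""
    rng = np.random.default_rng(seed)

    def abs_r(direction):
        return abs(np.corrcoef(feats @ direction, target)[0, 1])

    dirs = rng.standard_normal((n_dirs, feats.shape[1]))
    dirs *= np.linalg.norm(v) / np.linalg.norm(dirs, axis=1, keepdims=True)  # L2-match to v
    null = np.array([abs_r(d) for d in dirs])
    observed = abs_r(v)
    p = (1 + np.sum(null >= observed)) / (1 + n_dirs)   # add-one empirical p-value
    return observed, null, p

# Toy data: the target depends on the features only through the planted axis v.
rng = np.random.default_rng(1)
feats = rng.standard_normal((500, 64))
v = rng.standard_normal(64)
target = feats @ v + rng.standard_normal(500)

observed, null, p = random_direction_null(feats, target, v)
```

With 200 null directions the smallest attainable empirical $p$-value under this convention is $1/201$, so reported thresholds should be read with that resolution in mind.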

### Arousal asymmetry: where the recipe fails in text\.

Table [5](https://arxiv.org/html/2606.00129#A6.T5) sets side by side the same nine-story PCA recipe instantiated on a valence axis (V-axis, this paper) and on an arousal axis (A-axis, identical extraction protocol with the nine-story prompts re-anchored on arousal poles). If the recipe were content-blind (merely a high-capacity dimensionality-reduction trick that produces a graded affective scale from any pole-vs-pole prompt set), the V and A columns should agree across the five textual rows. They do not: V transfers cleanly across SST-2, EmoBank, Warriner, NRC, and OASIS, while A collapses to near-chance on every textual benchmark. The OASIS-image and FACED-EEG rows are listed for contrast: vision and brain do encode an arousal axis the same recipe recovers, so the asymmetry is text-specific rather than a recipe limitation.

Table 5: Valence transfers across modalities; arousal does not in text but does in vision. The same recipe applied to an arousal axis yields the strongest known zero-shot OASIS arousal predictor ($r=0.803$, beating supervised ridge $0.789$) but collapses on all text benchmarks. Cross-language A-axes are essentially orthogonal.

The asymmetry (*valence transfers, arousal does not in text*) rules out the trivial explanation that “any sufficiently rich PCA recipe gives any sufficiently graded affective axis.” Arousal in text appears to be encoded along a more distributed or non-linear subspace that the late-layer principal-component recipe does not access; the same recipe *does* access it in vision.

## Appendix S7 Brain-Side Deep Dive

### Per\-stimulus reliability\.

Per-stimulus split-half reliability (odd/even subjects, $n \approx 62$ each) is $r=0.988$, indicating a clean stimulus-level signal. Trial-level correlations ($n=3{,}444$ subject–stim pairs) range from $r=0.17$ to $0.21$ ($p<10^{-23}$); class-level ranking ($n=9$ centroids) reaches $r=+0.886$ vs. behavioural valence. The cohort $r=0.87$ headline is reliability-bounded.
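
The odd/even split-half computation can be sketched as follows, on synthetic scores. The sketch assumes one scalar score per (subject, stimulus) pair and no Spearman-Brown correction, since the text does not mention one; the function name and the noise level are hypothetical.

```python
import numpy as np

def split_half_reliability(scores):
    """scores: (n_subjects, n_stimuli). Correlate even- vs odd-indexed subject means."""
    even = scores[0::2].mean(axis=0)
    odd = scores[1::2].mean(axis=0)
    return float(np.corrcoef(even, odd)[0, 1])

# Toy cohort: 123 subjects, 28 stimuli, shared stimulus signal plus subject-level noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal(28)
scores = signal + rng.standard_normal((123, 28))

rel = split_half_reliability(scores)
# Trial-level correlation over all 123 * 28 = 3444 subject-stimulus pairs, for contrast.
trial_r = float(np.corrcoef(scores.ravel(), np.tile(signal, 123))[0, 1])
```

The contrast between `rel` and `trial_r` illustrates why a stimulus-level signal can be highly reliable after averaging over subjects even when single-trial correlations are modest.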

### 14\-LLM brain forest\.

Per-LLM brain-prediction quality across all 14 models is shown in Figure [8](https://arxiv.org/html/2606.00129#A7.F8): the 7 top-tier LLMs all clear their per-LLM random-direction nulls, while the out-tier sits at the median.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x8.png)

Figure 8: Per-LLM brain-prediction quality. Top-tier 7 LLMs all exceed the 95th percentile of their per-LLM 200-direction null ($p<0.05$ each); out-tier 3 LLMs sit at the median or below. Restricted to top-tier the log-hidden-dim $\to r$ scaling is $r=+0.125$, $p=0.684$; within-Qwen3 $r=+0.568$, $p=0.24$, $n=6$. Family-membership $\gg$ scale.

### Time-locked dynamics.

The mid-late $\alpha$/$\beta$ peak at $t=18$–$21$ s coincides with the LPP / sustained-attention window of Hajcak & Foti (Hajcak et al., [2010](https://arxiv.org/html/2606.00129#bib.bib121)) and Codispoti et al. (Codispoti et al., [2023](https://arxiv.org/html/2606.00129#bib.bib122)); the early $\gamma$ peak at $t=3$–$6$ s coincides with Müller’s affective-picture window (Müller et al., [1999](https://arxiv.org/html/2606.00129#bib.bib123)). Per-second cohort $|r|$ peaks at $t \approx 21$ s in the alpha band, at $t \approx 18$ s in beta, and at $t \approx 3$ s in gamma; the best individual channel reaches $|r|=0.68$ at $t=21$ s.

### Per\-subject peak heterogeneity\.

Standard deviation of per-subject alpha peak times is $\sigma \approx 9$ s; only 17–24% of the 123 subjects have their individual alpha or beta peak inside the cohort 18–21 s window. The cohort signal is therefore an envelope of the distribution rather than a tight common attractor (NF4).

### Gemma\-4 boundary check\.

Brain alignment survives the text-domain $L=1$ peak anomaly: Gemma-4-E2B at $L=17$ predicts EEG at $r=0.819$ ($p \approx 10^{-7}$), Gemma-4-E4B at $L=21$ at $r=0.714$ ($p \approx 2 \times 10^{-5}$). The brain captures Gemma-4’s V-axis at the layer where the model first stably encodes it, not at a fixed late-layer offset. Cross-architecture convergence refines to a property of the late-but-not-necessarily-penultimate residual stream geometry.

## Appendix S8 SEED-V Replication Full Numerics

We re-test all five FACED claims with a SEED-V-derived V-axis. To remove FACED leakage, we re-build the V-axis from scratch using the same Section [5](https://arxiv.org/html/2606.00129#S5) protocol applied to the 5 SEED-V emotion classes (Qwen3-4B at $L=35$, 50 stories per class, PCA on the 5 centred centroids).

### Step 1: SEED\-V cohort EEG–LLM alignment\.

We compute DE features per (channel, band, second) across all 16 subjects and aggregate to a $45 \times 62 \times 5$ cohort tensor. For each $(\text{channel}, \text{band})$ cell we compute the Pearson correlation with the SEED-V V-axis projection.

- Best cohort cell: P1 / $\theta$, $r=+0.6159$.
- Region-mean $|r|$: occipital $0.329$, parietal $0.347$, central $0.324$, frontal $0.253$.
- Posterior dominance: occipital $-$ frontal $= +0.076$.
- Random-direction null (200 directions matched in $L_2$): empirical $|r|_{\max} = 0.6159$ vs. null 95th percentile $= 0.478$, $p<0.005$.
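
The per-cell correlation map behind these numbers can be computed in one vectorised pass. The sketch below uses a synthetic $45 \times 62 \times 5$ tensor with a signal planted in one cell; the function name `cellwise_pearson` and the planted cell index are hypothetical, and real channel names (such as P1) would come from the dataset montage.

```python
import numpy as np

def cellwise_pearson(tensor, target):
    """tensor: (n_stim, n_channels, n_bands); target: (n_stim,). Returns r per (channel, band)."""
    x = tensor - tensor.mean(axis=0)              # centre each cell across stimuli
    y = target - target.mean()
    num = np.tensordot(y, x, axes=(0, 0))         # covariance numerator per cell
    den = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    return num / den

rng = np.random.default_rng(0)
target = rng.standard_normal(45)                  # stand-in for V-axis projection per stimulus
tensor = rng.standard_normal((45, 62, 5))         # stand-in for the cohort DE tensor
tensor[:, 10, 1] += 3.0 * target                  # plant a signal in one (channel, band) cell

r = cellwise_pearson(tensor, target)
best = np.unravel_index(np.argmax(np.abs(r)), r.shape)   # location of |r|_max
```

Region means such as the occipital or frontal $|r|$ above would then be averages of `np.abs(r)` over the channel indices belonging to each region.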

### Step 2: SEED\-V brain topography\.

The Davidson FAA probe (positive vs. negative valence, right $-$ left) on SEED-V does not replicate the sign of the FACED FAA topography:

Table 6: Davidson FAA on SEED-V. Three of four pairs are negative (opposite to the strictly positive FACED pattern of Appendix [S9](https://arxiv.org/html/2606.00129#A9)); only F8–F7 matches the FACED sign. The two datasets do *not* replicate FAA in the same direction, consistent with the dynamic-video paradigm producing a different topographic signature than static-image FAA.

### Step 3: SEED-V cross-architecture convergence.

Penultimate (200-D) features from 15 stock CBraMod SEED-V-5s checkpoints (3 protocols $\times$ 5 seeds: vanilla baseline, augmentation, KD-midlayer) are aggregated per class on the test set. Class-level PC1/best-PC scores correlated with the SEED-V V-axis class projection give $r(\mathrm{BACC}, \text{class-best-PC } r) = +0.601$ ($p=0.018$, $n=15$) and $r(\mathrm{BACC}, \text{trial-best-PC } r) = +0.632$ ($p=0.012$). The unsigned class-PC1 correlation is $+0.358$ ($p=0.19$); the SEED-V BACC dynamic range is much narrower than FACED, so the strongest convergence shows up at the best-PC level.

### Step 4: SEED\-V V\-axis as supervision\.

Adding a V-axis topo-MSE auxiliary loss ($\lambda_{\mathrm{vax}} = 0.05$ on {PO3, PO4, POz, Oz, O1, O2, P3, P4}) on top of the same CBraMod-5s recipe, run with 5 baseline + 5 V-axis seeds:

Table 7: SEED-V V-axis-as-supervision (within seed noise of zero). Two factors likely explain this: (1) the SEED-V class-level V-axis is sharply 5-valued (cohort class-level $r=+0.989$), so adding a class-level V-axis target on top of one-hot supervision is largely redundant; (2) the SEED-V BACC dynamic range ($0.43$–$0.45$) is much narrower than FACED ($0.55$–$0.58$), so a small effect is hard to detect with five seeds.

### Step 5: SEED-V directional ablation (Arditi-style).

At inference time on each of the 15 SEED-V CBraMod checkpoints (5 baseline + 5 augmented + 5 KD-midlayer) we project penultimate features onto the orthogonal complement of the trial-level best-PC V-axis direction (no retraining; the class-level direction is degenerate at $k=5$, so we use the trial-level analogue, mirroring the $\text{trial-best-PC } r = +0.632$ estimator from Step 3). Every checkpoint shows a negative $\Delta\mathrm{BACC} \in [-0.0291, -0.0093]$ (population mean $-0.0173 \pm 0.0061$); a matched random-direction control ($n=20$ uniform on $\mathbb{S}^{D-1}$) is centred at $0$ ($\bar{\Delta}_{\mathrm{rand}} = +0.00013 \pm 0.0008$). Per-checkpoint $z$-scores range $[+9.65, +44.92]$ (mean $+23.6$, all $p<0.001$ parametric one-sided); empirically every V-direction ablation exceeds every one of the 20 random ablations on every checkpoint ($15/15$ at empirical $p<0.05$). The SEED-V directional effect is proportionally larger than FACED (mean $\Delta/\mathrm{baseline} \approx 3.9\%$ vs. $2.4\%$ on FACED) and the mean $z$ is $3\times$ FACED’s. As on FACED, V-axis amplification (rather than ablation) is not directionally consistent (mean $\Delta \approx 0$, mean $z \approx -2$), which is expected, since the SEED-V V-axis already aligns with class-discriminative variance and doubling it does not help. The per-checkpoint table and the diagnostic information for both class-level (degenerate) and trial-level (used) residual directions are stored in `seedv_causal_ablation_full.json`.
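
The ablation itself is a single projection, $f \mapsto f - (f \cdot \hat{v})\,\hat{v}$, applied to frozen features. The sketch below illustrates it on a deliberately extreme toy problem in which all class information lies along $v$, with a nearest-centroid read-out standing in for the trained classifier head. The toy data, the read-out, and all function names are assumptions for illustration; the real checkpoints lose only a few percent of BACC.

```python
import numpy as np

def ablate_direction(feats, v):
    """Project each row of feats onto the orthogonal complement of v."""
    v = v / np.linalg.norm(v)
    return feats - np.outer(feats @ v, v)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls."""
    return float(np.mean([(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]))

def nearest_centroid(feats, centroids):
    """Assumed stand-in for a frozen classifier head (no retraining after ablation)."""
    d2 = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

rng = np.random.default_rng(0)
D, n_per = 200, 100                                   # 200-D penultimate features, 5 classes
v = rng.standard_normal(D)
v /= np.linalg.norm(v)
mu = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])            # toy class offsets along v
y = np.repeat(np.arange(5), n_per)
centroids = mu[:, None] * v[None, :]
feats = centroids[y] + 0.3 * rng.standard_normal((len(y), D))

base = balanced_accuracy(y, nearest_centroid(feats, centroids))
abl = balanced_accuracy(y, nearest_centroid(ablate_direction(feats, v), centroids))
rand = np.array([                                     # 20 random-direction controls
    balanced_accuracy(y, nearest_centroid(ablate_direction(feats, d), centroids))
    for d in rng.standard_normal((20, D))
])
delta_v, delta_rand = abl - base, rand - base
```

Comparing `delta_v` against the spread of `delta_rand` is the same contrast that the $z$-scores and the empirical $15/15$ count above summarise.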

## Appendix S9 Brain Topography Deep Dive

### Davidson FAA full pairs\.

Figure [9](https://arxiv.org/html/2606.00129#A9.F9) compares all three Davidson asymmetry pairs to the posterior occipital region-mean reference.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x9.png)

Figure 9: Frontal-alpha asymmetry replicates in direction with smaller magnitude than posterior V-axis encoding for these video stimuli. (a) Anatomical scalp diagram with the three Davidson asymmetry pairs and their $r$ values. (b) Per-pair alpha and delta asymmetry bars. (c) Direct comparison to the occipital region-mean $|r|=0.21$.

### 9-stim contrast and Simpson’s paradox.

Table [8](https://arxiv.org/html/2606.00129#A9.T8) decomposes the cohort-level brain–LM correlation by stimulus subset, isolating which of the $28$ FACED stimuli carry the V-axis signal. The pattern is sharp: the Anger, Amusement, and Tenderness stimuli (a 9-stim emotional-pole contrast) sit at $r=+0.870$, while removing them collapses the residual $19$ stimuli to $r=-0.015$. The full-cohort $r=+0.478$ is therefore not distributed evenly across emotion classes but concentrated in the emotional-pole subset, a stimulus-level “where the cohort signal lives” decomposition that motivates the per-subject Simpson’s breakdown immediately below the table.

Table 8: The cohort V-axis effect is essentially an *anger-versus-warm-positive* contrast across nine stimuli.

- Cohort $r$ at fixed top-8 channels: $+0.021$
- Mean per-subject $r$ at fixed top-8 channels: $-0.062$
- Per-subject best-channel oracle (1 cell): $\overline{|r|} = 0.551$
- Per-subject best (channel, band) oracle: $\overline{|r|} = 0.616$

The all-band oracle ($\overline{|r|} = 0.616$) exceeds the cohort-fixed top-8 ($+0.021$) by $+0.60$ in absolute $|r|$, indicating considerable per-subject signal that the cohort summary suppresses. This is the Simpson’s-paradox dynamic that Sani et al. (Vahidi et al., [2024](https://arxiv.org/html/2606.00129#bib.bib126)) flag for cross-subject EEG-emotion modelling more broadly (Figure [10](https://arxiv.org/html/2606.00129#A9.F10)).
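
The three summaries compared above (cohort $r$ at fixed cells, mean per-subject $r$ at the same cells, and the per-subject best-cell oracle) can be sketched as follows. The sketch runs on pure-noise synthetic data with $32 \times 5 = 160$ cells per subject; the function names, the choice of the first 8 cells as the "fixed" set, and the data are all hypothetical.

```python
import numpy as np

def pearson_cols(x, y):
    """Pearson r between y (n,) and every column of x (n, m)."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    return (yc @ xc) / np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())

def cohort_vs_oracle(de, target, fixed):
    """de: (n_subj, n_stim, n_cells); fixed: indices of the cohort-chosen cells."""
    cohort_r = np.corrcoef(de[:, :, fixed].mean(axis=(0, 2)), target)[0, 1]
    subj_fixed = np.array([np.corrcoef(s[:, fixed].mean(axis=1), target)[0, 1] for s in de])
    oracle = np.array([np.abs(pearson_cols(s, target)).max() for s in de])
    return float(cohort_r), float(subj_fixed.mean()), float(oracle.mean())

rng = np.random.default_rng(0)
de = rng.standard_normal((123, 28, 160))     # subjects x stimuli x (channel, band) cells
target = rng.standard_normal(28)             # stand-in for per-stimulus V-axis projection
cohort_r, subj_mean, oracle_mean = cohort_vs_oracle(de, target, fixed=np.arange(8))
```

One design caveat: a best-cell oracle is optimistically biased by selection over many cells, and on this pure-noise input it is already far above the fixed-cell summaries, so oracle values are best interpreted against a matched null rather than in isolation.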

![Refer to caption](https://arxiv.org/html/2606.00129v1/x10.png)

Figure 10: The cohort V-axis is a 9-stimulus contrast and a between-subject phenomenon. Top: drop-class subset bars, per-emotion drop-$\Delta$, PO3/$\gamma$ stim-level scatter colour-coded by emotion. Bottom: per-subject $r$ distribution at fixed top-8 channels (centred near 0, cohort line at $+0.02$); per-subject best-channel oracle distribution (centred at $|r| \approx 0.55$); summary callout.

### Time-resolved peaks.

Figure [11](https://arxiv.org/html/2606.00129#A9.F11) traces per-second cohort $|r|$ across all five frequency bands.

- Alpha ($8$–$13$ Hz): peak at $t=21$ s, cohort $r=0.40$; best individual stimulus $r=0.68$.
- Beta ($13$–$30$ Hz): peak at $t=18$ s, cohort $r=0.41$; best stim $r=0.62$.
- Gamma ($30$–$47$ Hz): peak at $t=3$–$6$ s, cohort $r=0.21$; best stim $r=0.61$.
- Delta/Theta ($0.5$–$8$ Hz): peak at $t=12$ s, cohort $r=0.18$.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x11.png)

Figure 11: Time-resolved V-axis encoding across the 30-second clip. (a) Per-second cohort $|r|$ across all $28$ stimuli, one trace per frequency band; the early $\gamma$ peak ($t \approx 2$ s) coincides with Müller’s affective-picture window and the sustained $\alpha/\beta$ peak ($t \approx 17$–$20$ s) coincides with the LPP sustained-attention window. (b) Same trace restricted to the $9$-stimulus emotional-pole subset where the cohort effect lives; $\alpha$ reaches $|r| \approx 0.78$. (c) Per-subject distribution of $\alpha$-peak times: only $\sim 24$% of subjects peak inside the cohort 18–21 s window, evidence that the cohort signal is a between-subject phenomenon rather than a tight common attractor.

### Functional connectivity.

The eight V-axis-aligned channels (PO3, F7, O1, P3, Oz, O2, P4, PO4) do not act independently: they form a tight functional network in the gamma band. Pairwise $\gamma$-DE correlation has mean off-diagonal $r=0.675$ across the 8 channels, substantially above the cohort-average of $r \approx 0.32$ in this band. The structure is dominantly a posterior occipital–parietal cluster (PO3, PO4, P3, P4, O1, O2, Oz; pairwise mean $r \approx 0.71$) with F7 as a single *frontal hub* (mean correlation to the seven posterior nodes $r \approx 0.47$; Figure [12](https://arxiv.org/html/2606.00129#A9.F12)).

![Refer to caption](https://arxiv.org/html/2606.00129v1/x12.png)

Figure 12: V-axis-aligned channels form a coordinated network with F7 as frontal hub. Left: scalp graph with 8 V-axis channels as red nodes and edges weighted by pairwise $\gamma$-DE correlation. Right: hierarchically clustered $8 \times 8$ correlation matrix; F7 auto-isolates as the frontal hub.

### Mutual information vs. linear correlation.

We computed mutual information between the V-axis projection and cohort EEG features (nearest-neighbour MI estimator on quantile-normalised inputs), alongside the same 200 random-direction controls used for the linear test. Observed MI $0.112$; null mean $0.036$; $p_{\mathrm{MI}} = 0.115$ (n.s.) vs. linear $p_r = 0.020$. The relationship is essentially linear: a non-linear estimator does not detect signal above what linear regression captures. We use ridge regressions throughout.

### Theta\-gamma coupling does not mediate the V\-axis\.

Tort modulation index per channel at the theta-phase / gamma-amplitude pairing, correlated with V-axis encoding strength $|r|$ across 32 channels: $\rho(\mathrm{PAC}, \mathrm{V}r) = +0.082$, $p=0.667$. Channels with strong theta-gamma CFC are not the channels that encode the V-axis, and vice versa; the V-axis encoding is dissociable from the phase-amplitude-coupling channel (Canolty and Knight, [2010](https://arxiv.org/html/2606.00129#bib.bib124)).

## Appendix S10 Saturation: Full Intervention Table and Forest

This appendix lists every V-axis-as-supervision intervention reported in the saturation regularity (§[3](https://arxiv.org/html/2606.00129#S3)). Table [9](https://arxiv.org/html/2606.00129#A10.T9) gives per-recipe $\Delta$BACC against the strong EMOD $d=6$ + KD + aug baseline, the paired-seed $p$-value, and a verdict tier; the two SOTA-recipe replications below the line confirm the cliff persists at the BACC $\approx 0.66$ regime where saturation is operative. Figure [13](https://arxiv.org/html/2606.00129#A10.F13) renders the same $25$ entries as a forest plot ranked by effect size, so the asymmetry is visible at a glance: every significant entry sits to the left of zero, and even the largest n.s. positive (EMODSTYLE-stim, $+0.0066$, $p=0.22$) is within seed noise. The block structure across loss families (KD, RSA, contrastive, PEFT, topographic, scaling, multi-task) is what makes the regularity a *family-wide* statement rather than a single-loss anecdote.

| # | Family / variant | $\Delta$ | $p$ | Verdict |
|---|---|---|---|---|
| 1 | Frontal-mask $\lambda{=}0.5$ | $-0.052$ | $0.0015$ | sig. negative |
| 2 | Frontal-mask $\lambda{=}0.1$ | $-0.009$ | $0.21$ | n.s. |
| 3 | FAA $\lambda{=}0.5$ | $-0.044$ | $0.006$ | sig. negative |
| 4 | FAA $\lambda{=}0.1$ | $-0.018$ | $0.063$ | borderline |
| 5 | Anger-weighted $\lambda{=}0.5$ | $-0.054$ | $0.0003$ | sig. negative |
| 6 | Occipital $\lambda{=}0.1$ | $-0.022$ | $0.007$ | sig. negative |
| 7 | Topo-optimal $\lambda{=}0.1$ | $-0.013$ | $0.039$ | sig. negative |
| 8 | Topo-optimal $\lambda{=}0.05$ | $+0.0021$ | $0.50$ | n.s. (NULL) |
| 9 | Procrustes $\lambda{=}0.05$ | $+0.0012$ | $0.89$ | n.s. (NULL) |
| 10 | RSA $\lambda{=}5.0$ | $-0.093$ | $<10^{-4}$ | sig. negative |
| 11 | RSA $\lambda{=}1.0$ | $-0.057$ | $<10^{-3}$ | sig. negative |
| 12 | Distance-CE $\tau{=}5.0$ | $-0.397$ | $<10^{-4}$ | catastrophic |
| 13 | Multi-V $\lambda{=}0.5$ | $-0.043$ | $<10^{-3}$ | sig. negative |
| 14 | EEG-AUX MSE $\lambda{=}0.5$ | $-0.043$ | $<10^{-3}$ | sig. negative |
| 15 | EEG-AUX MSE $\lambda{=}0.1$ | $-0.016$ | $0.04$ | sig. negative |
| 16 | EMODSTYLE class $\lambda{=}1.0$ | $-0.010$ | $0.18$ | n.s. |
| 17 | EMODSTYLE stim $\lambda{=}0.5$ | $+0.0066$ | $0.22$ | n.s. (sweet spot) |
| 18 | PEFT fullhead $\lambda{=}0.1$ | $-0.009$ | $0.069$ | borderline |
| 19 | PEFT LoRA $\lambda{=}0.1$ | $-0.010$ | $0.12$ | n.s. |
| 20 | PEFT IA3 $\lambda{=}0.1$ | $-0.016$ | $0.05$ | sig. negative |
| 21 | Pretrain-FT (frozen) | $-0.401$ | $<10^{-4}$ | catastrophic |
| 22 | Pretrain-FT (unfrozen) | $-0.190$ | $<10^{-4}$ | sig. negative |
| 23 | XEEG (FACED $\to$ SEED-V) | $+0.0015$ | $0.97$ | n.s. (zero) |
| 24 | SCALING (full data) $\lambda{=}0.5$ | $-0.045$ | $<0.01$ | sig. negative |
| 25 | Uncertainty multi-task | $-0.067$ | $<10^{-3}$ | sig. negative |
| | SOTA recipe + Topo $\lambda{=}0.1$ | $-0.015$ | $0.020$ | sig. negative (cliff) |
| | SOTA recipe + EMODSTYLE | $-0.024$ | $0.001$ | sig. negative (cliff) |

Table 9: Full V-axis-as-supervision intervention results. 16 / 25 families produce a statistically significant negative (14 sig. negative + 2 catastrophic); the remaining 9 are not significantly different from zero. None reach $p<0.05$ in the positive direction at the strong baseline. Below the line (the two unnumbered rows) are the SOTA-recipe replications confirming the saturation cliff. Forest plot in Figure [13](https://arxiv.org/html/2606.00129#A10.F13).

![Refer to caption](https://arxiv.org/html/2606.00129v1/x13.png)

Figure 13: Forest plot of all V-axis-as-supervision interventions (Table [9](https://arxiv.org/html/2606.00129#A10.T9)) ranked by $\Delta$ vs. baseline, with significance tier coloured. None of the 25 reach $p<0.05$ in the positive direction at the strong baseline.

## Appendix S11 Saturation Extras: Anger Paradox, Mechanism Check, Path B

### Saturation transition table\.

Table [10](https://arxiv.org/html/2606.00129#A11.T10) pairs the same V-axis intervention (Topo or EMODSTYLE auxiliary loss) against five base recipes spanning the BACC range from $0.572$ (CBraMod baseline) to $0.6581$ (EMOD $d{=}6$ SOTA pre-ensemble). Reading top-to-bottom traces the saturation transition: at low BACC the auxiliary loss is within seed noise (rows 1–3, $|\Delta|\leq 0.007$, all n.s.), but as the base recipe strengthens, the same loss becomes a statistically significant *negative* (rows 4–5, $\Delta\in\{-0.015,-0.024\}$, $p<0.05$ and $p<0.01$). No row crosses the boundary in the helpful direction. This is the data behind the “no positive setpoint” wording in §7 and the basis for treating $\mathrm{BACC}\approx 0.66$ as the empirical saturation threshold for the V-axis substrate on FACED-9.

Table 10: V-axis supervision is within seed noise at weak baselines and statistically significantly *harms* the strong SOTA recipe. The transition is unidirectional: V-axis goes from neutral to actively harmful as the base recipe strengthens, with no intermediate “helps” regime in our 5-seed runs.
### Saturation cliff figure\.

Figure [14](https://arxiv.org/html/2606.00129#A11.F14) plots $\Delta$BACC against base-recipe BACC across all $25$ interventions, showing the unidirectional sign-flip in the $[0.62, 0.66]$ band.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x14.png)

Figure 14: $\Delta$BACC from V-axis supervision plotted against base-recipe BACC across $\sim$25 interventions. The sign of the effect transitions in the interval $[0.62, 0.66]$.
### Why anger\-weighting hurts despite the analytical ceiling\.

Section [7](https://arxiv.org/html/2606.00129#S7) showed that the cohort EEG signal is carried by a 9-stimulus emotional-pole contrast (Anger + Amusement + Tenderness, $r=0.870$ at PO3/$\gamma$). Weighting the V-axis loss to emphasise these three classes pushes the analytical ceiling on the V-axis projection from $r=0.478$ (uniform) to $r=0.714$ (anger-weighted). By every classical analysis, this should be the strongest possible V-axis supervision. Instead it is one of the more harmful interventions we tested: $\Delta=-0.054$, $p=0.0003$. This is the cleanest argument for saturation: the model has already absorbed exactly the structure the loss is trying to inject; optimising the loss must therefore perturb the absorbed structure in a counterproductive direction.

### Direct mechanism check\.

For each of 8 V-axis-supervised variants (Topo at $\lambda{=}0.05$, Procrustes at $\lambda{=}0.05$, EMODSTYLE-stim at $\lambda{=}0.5$, full SOTA recipe + Topo/EMODSTYLE at $e{=}100$/$e{=}150$), we measure the change in two quantities relative to the matched-recipe baseline: $\Delta r_{\mathrm{PC1}}$ (class-mean V-axis encoding) and $\Delta r_{\mathrm{resid}}$ (within-class V-axis residual encoding). V-axis training raises class-PC1 $|r|$ by $+0.01$ to $+0.36$ across the 8 variants, but moves the within-class residual encoding by only $\sim 10^{-7}$ in absolute value, which is numerically zero. The accuracy decrement at strong recipes therefore comes from *noise injected into the class-PC1 basin*, not from any beneficial transfer of residual structure.

### Path B: ensembling\-in V\-axis checkpoints\.

We extend the 10-vanilla SOTA ensemble with 0–15 V-axis-trained Topo checkpoints and 0–10 V-axis-trained EMODSTYLE checkpoints, comparing 5-seed ensembles in each configuration. Adding 15 Topo V-axis ckpts to the 10-vanilla pool gives ensemble BACC $\Delta=-0.0145$ ($p<10^{-3}$); adding all 25 V-axis ckpts gives $\Delta=-0.0193$ ($p<10^{-3}$). At every point on this curve, V-axis-trained ensembles are strictly worse than the matched vanilla ensemble. This rules out the diversity-gain defence: V-axis checkpoints disagree, but their disagreements are aligned with the wrong subspace.

## Appendix S12 Convergence and Ensemble Theory: Extras

### Per\-architecture breakdown of the 36\-checkpoint correlation\.

CBraMod checkpoints (lower BACC, $\sim 0.57$) cluster at low V-axis encoding strength (mean class-PC1 $|r|=0.21$). EMOD-vanilla d6 ($\sim 0.65$ BACC) sits in the middle (mean $0.67$); EMOD d6 e150 ($\sim 0.66$ BACC) clusters at the top (mean $0.69$). The trend spans the full BACC range $[0.57, 0.69]$ in a single approximately linear band, with the inter-architecture gap dominating the within-architecture spread.

### Random\-direction null detail\.

We sample 1000 random Gaussian directions $w\in\mathbb{R}^{28}$, each matched in $L_2$ norm to $v_{\mathrm{CLIP}}$, and recompute the 36-checkpoint correlation $\rho_w=\mathrm{Pearson}(\mathrm{BACC},\,|\mathrm{corr}(\mathrm{PC}_1(H(m)),w)|)$. The distribution of $\rho_w$ has mean $-0.03$ and standard deviation $0.62$ (range $[-0.95,+0.95]$); the wide null is a consequence of the low intrinsic dimensionality of the class-PC1 subspace ($D_{\mathrm{eff}}=9$). The empirical V-axis correlation $+0.885$ sits at the $93.5^{\mathrm{th}}$ percentile of this null ($p_{\mathrm{one}}=0.066$), so it does not reach one-sided significance at the $0.05$ level. We therefore treat the within-class residual ($D_{\mathrm{resid}}\approx D_m-9$) as the statistically robust signal.

### Class\-PC1 basin saturation\.

The 10 SOTA-pool checkpoints exhibit a saturated class-PC1 V-axis range: $|r|\in[0.60,0.77]$ across all 10 checkpoints (mean $0.69$, std $0.05$). Single-seed variation in class-mean V-axis structure is small. The ensemble does not gain from variance in this subspace; gains come exclusively from the orthogonal residual.

### Directional ablation figure\.

Figure [15](https://arxiv.org/html/2606.00129#A12.F15) shows the per-checkpoint causal effect: ablating $v_{\mathrm{resid}}$ drops BACC by $\overline{\Delta}=-0.0157$ on average across all $10$ pool members, well above the matched-norm random-direction null (mean $z\approx 7.7$, all $p<0.001$).

![Refer to caption](https://arxiv.org/html/2606.00129v1/x15.png)

Figure 15: Directional ablation of $v_{\mathrm{resid}}$ on all 10 SOTA-pool checkpoints. *Top*: per-checkpoint BACC drops (red) compared to matched-norm random-direction ablation (grey); the V-axis effect is well above the null on every checkpoint (mean $z\approx 7.7$, all $p<0.001$). *Bottom*: per-checkpoint ablation $\Delta$BACC vs. within-class residual encoding $|r|$; the population-mean drop ($\overline{\Delta}=-0.0157$) confirms the residual-encoding mechanism across the pool.
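The ablation step can be illustrated with a minimal sketch. It assumes that "directional ablation" means projecting the direction out of a hidden representation, $h' = h - (h\cdot\hat{v})\,\hat{v}$; the text does not spell out the exact operation, and the function name `ablate_direction` and the toy vectors are hypothetical.

```python
import math

def ablate_direction(h, v):
    """Remove the component of h along v (assumed form of directional ablation)."""
    norm = math.sqrt(sum(x * x for x in v))
    unit = [x / norm for x in v]                    # unit vector along v
    coeff = sum(a * b for a, b in zip(h, unit))     # scalar projection of h on v
    return [a - coeff * u for a, u in zip(h, unit)]

# Toy example: the second coordinate is the direction being ablated.
h_toy = [3.0, 4.0, 0.0]
v_toy = [0.0, 2.0, 0.0]
h_ablated = ablate_direction(h_toy, v_toy)
```

In a full experiment the same projection would be applied to each checkpoint's hidden features before the classification head, and the resulting BACC drop would be compared with drops obtained from matched-norm random directions.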

## Appendix S13 Ensemble Generality, Mega-Pool, and Best-Single

### Recipe cascade and two\-tier scatter\.

Figure [16](https://arxiv.org/html/2606.00129#A13.F16) traces the seven recipe steps from CBraMod’s $0.572$ to our $0.6948$ ensemble. Figure [17](https://arxiv.org/html/2606.00129#A13.F17) shows the per-checkpoint within-class residual–ensemble-contribution correlation that motivates the ensembling strategy.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x16.png)

Figure 16: Recipe ablation cascade from the EMOD baseline at $0.6287$ (previous FACED-9 SOTA (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101))) to our $10$-checkpoint ensemble at $0.6948$ ($+10.5\%$ relative). Each bar is one recipe component or ensembling step. Older baselines (CBraMod $0.572$, EmotionKD $0.628$) shown for reference.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x17.png)

Figure 17: Two-tier ensemble theory. Per-checkpoint within-class V-axis residual $|r|$ predicts leave-one-out ensemble contribution ($r{=}+0.74$, $p=0.014$, $n=10$). Top-7 vs. bottom-7 split highlighted.
### Generality of the ensemble mechanism\.

Figure [18](https://arxiv.org/html/2606.00129#A13.F18) extends the two-tier picture across four benchmarks (FACED, SEED-V, CIFAR-10, MNIST).

![Refer to caption](https://arxiv.org/html/2606.00129v1/x18.png)

Figure 18: Two-tier ensemble gain across four benchmarks. $\Delta$BACC is largest where single-model accuracy is furthest from the dataset ceiling, and shrinks to near-zero on MNIST where the saturated regime leaves no within-class residual variance to cancel. (a) Scatter of $\Delta$ vs. headroom $1{-}\bar{\mathrm{Acc}}$, with linear fit slope $\sim 0.05$. (b) Paired single-seed-mean vs. 5-seed-ensemble bars per benchmark, sorted by headroom.

As individual accuracy approaches the Bayes optimum, the within-class residual variance, which is the source of ensemble gain, shrinks toward zero. The two-tier picture predicts this: ensemble gain comes from variance reduction in the within-class V-axis residual subspace, which has measure zero at the dataset ceiling. SEED-V, at $0.37$ single-model accuracy, has the largest residual-variance ceiling; we observe $+0.034$ gain. MNIST at $0.985$ has essentially none; we observe $+0.001$ gain.

### Mega\-ensemble null\.

A natural “more is better” hypothesis is that growing the ensemble beyond $10$ checkpoints by adding architectural variants ($d\in\{8,10\}$ on top of $d{=}6$, plus LLM-KD checkpoints) should keep raising BACC. Table [11](https://arxiv.org/html/2606.00129#A13.T11) tests that hypothesis directly. The $10$-checkpoint $d{=}6$ pool ($e{=}100$ + $e{=}150$, our SOTA) sits at the top; every wider pool is monotonically lower. The mechanism check follows the table.

Table 11: Mega-ensemble null. Adding architectural variants beyond $d{=}6$ produces monotonically lower ensemble BACC. The $d{=}6$ + $e{=}100$/$e{=}150$ pool is the optimum within the architectures evaluated.

Two mechanisms explain the null. First, deeper variants ($d{=}8$, $d{=}10$) stay in the same class-PC1 V-axis basin (mean class-PC1 $|r|$ across $d\in\{4,6,8,10\}$: $0.43, 0.43, 0.39, 0.38$) and their within-class residuals overlap more than $d{=}6$ seeds do. Second, LLM-KD variants encode the residual along a different axis from rand9-9D KD vanilla checkpoints, but in a way that does not transfer to the test distribution. Cross-architecture mixing ($d\in\{4,6,8,10\}$) gives $0.6791$, essentially tied with within-arch x-seed at $0.6798$. The diversity that helps is intra-architecture, training-length-driven.

### Cohen $\kappa$ disagreement structure.

Single-seed accuracy at $e{=}100$ ($0.6581$) and $e{=}150$ ($0.6581$) is identical. Cohen $\kappa$ within-$e{=}100$ is $0.694$, within-$e{=}150$ $0.679$, cross-group $0.702$ (higher $\kappa$ = more agreement). All three are similar: per-trial prediction patterns are about as concordant across groups as they are among $e{=}100$ checkpoints. Yet mixing $e{=}100$ and $e{=}150$ gives $+0.014$ over either group alone. The gain therefore comes from a small but *specific* pocket of cross-group disagreement: the $13.1\%$ of test trials where the $e{=}100$ and $e{=}150$ pools differ, on which the mixed ensemble gains $+10.3$ percentage points. The diversity is targeted (which trials), not population-wide ($\kappa$).
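For readers who want to reproduce this kind of agreement analysis, the following is a minimal sketch of Cohen's $\kappa$ between two sets of per-trial predicted labels, together with the raw disagreement fraction. How the pairwise values are aggregated within and across the $e{=}100$ and $e{=}150$ groups is not specified here, so averaging over checkpoint pairs would be an assumption; the label lists below are toy data.

```python
from collections import Counter

def cohen_kappa(pred_a, pred_b):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    n = len(pred_a)
    p_obs = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    count_a, count_b = Counter(pred_a), Counter(pred_b)
    # Expected agreement if the two predictors were independent.
    p_exp = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    return (p_obs - p_exp) / (1.0 - p_exp)

def disagreement_fraction(pred_a, pred_b):
    """Fraction of trials on which the two predictors give different labels."""
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)

# Toy predictions from two hypothetical checkpoints over six trials.
toy_a = [0, 0, 1, 1, 2, 2]
toy_b = [0, 0, 1, 2, 2, 2]
toy_kappa = cohen_kappa(toy_a, toy_b)
toy_disagree = disagreement_fraction(toy_a, toy_b)
```

The two quantities answer different questions: $\kappa$ summarises agreement over the whole test set, while the disagreement fraction identifies the subset of trials on which mixing pools can change the ensemble vote.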

### Best single checkpoint without ensembling\.

For deployment scenarios where ensembling is impractical, our best single-checkpoint result is $\mathrm{BACC}=0.6755$ on the $d{=}6$, $e{=}150$ recipe, seed $789$, val-selected. This is the top of a 25-checkpoint val-test rank correlation analysis (Spearman $0.825$, $n=25$) and exceeds CBraMod by $+0.103$ and EMOD by $+0.047$ on a strict apples-to-apples test split.

## Appendix S14 EEG Model Training Details

### Architecture\.

EMOD axial transformer (Chen et al., [2025](https://arxiv.org/html/2606.00129#bib.bib101)): input $32\times 5$ DE features, axial self-attention over channels and bands at depth $d\in\{3,6\}$, filter dimension $f=128$. The full SOTA recipe is $d=6$, $f=128$.

### Optimisation\.

AdamW with $\mathrm{lr}=10^{-3}$, $\beta_1=0.9$, $\beta_2=0.999$, weight decay $10^{-2}$. Cosine schedule with 5-epoch linear warmup. Batch size $128$. Training length $e\in\{100,150\}$ epochs. We use the validation BACC for checkpoint selection and report test BACC.
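The schedule can be sketched as a plain function of the epoch index. This is an illustration under assumptions not stated above: the learning rate is updated once per epoch, warmup ramps linearly from $\mathrm{lr}/5$ up to the base rate over the first five epochs, and the cosine phase decays towards zero. The name `lr_at_epoch` is hypothetical.

```python
import math

def lr_at_epoch(epoch, total_epochs=100, warmup_epochs=5, base_lr=1e-3):
    """Learning rate for a 0-indexed epoch: linear warmup, then cosine decay."""
    if epoch < warmup_epochs:
        # Assumed warmup shape: base_lr/5, 2*base_lr/5, ..., base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup phase completed, in [0, 1).
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Schedules for the two training lengths used in the recipe.
schedule_e100 = [lr_at_epoch(e, total_epochs=100) for e in range(100)]
schedule_e150 = [lr_at_epoch(e, total_epochs=150) for e in range(150)]
```

Under these assumptions the $e{=}150$ runs spend more epochs at intermediate learning rates than the $e{=}100$ runs, which is one plausible source of the training-length-driven diversity discussed in Appendix S13.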

### Augmentation\.

Per-trial Gaussian noise ($\sigma=0.05\cdot\mathrm{std}$ of feature distribution), random channel dropout ($p=0.15$), and random temporal masking (up to $5\%$ of seconds). All augmentations are applied with $p=0.6$ during training only.
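A minimal sketch of the first two augmentations on a single $32\times 5$ feature matrix is given below. Several details are assumptions rather than the recipe itself: the standard deviation is taken over the trial's own features, each augmentation is gated independently with probability $0.6$, and dropped channels are set to zero. Temporal masking is omitted because it depends on how seconds map onto the features. The function name `augment` is hypothetical.

```python
import random
from statistics import pstdev

def augment(features, rng, p_apply=0.6, noise_scale=0.05, p_drop=0.15):
    """Return an augmented copy of a channels-by-bands feature matrix."""
    out = [row[:] for row in features]  # copy so the input is not mutated
    if rng.random() < p_apply:
        # Gaussian noise with sigma = 0.05 * std of the (per-trial) features.
        sigma = noise_scale * pstdev([x for row in features for x in row])
        out = [[x + rng.gauss(0.0, sigma) for x in row] for row in out]
    if rng.random() < p_apply:
        # Channel dropout: zero an entire channel with probability p_drop.
        out = [[0.0] * len(row) if rng.random() < p_drop else row for row in out]
    return out

# Toy trial: 32 channels x 5 bands of synthetic values.
toy_trial = [[float(c + b) for b in range(5)] for c in range(32)]
toy_augmented = augment(toy_trial, random.Random(0))
```

At evaluation time the function would simply not be called, matching the statement that augmentation is applied during training only.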

### Knowledge distillation\.

Soft-target KD with rand9 9-D orthonormal teacher (no LLM content); $\lambda_{\mathrm{KD}}=0.5$, $T=1.0$. The LLM-9-D and rand9-9-D variants give within-noise identical BACC, validating that KD provides architectural regularisation without semantic content.

### Ensemble\.

10 checkpoints split as 5 seeds at $e=100$ + 5 seeds at $e=150$. Seeds: $\{42,123,456,789,2025\}$ each. Ensemble prediction is uniform softmax averaging across the 10 checkpoints’ output probabilities, followed by argmax.
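The ensemble rule described above can be sketched for a single trial as follows. The logits are toy values for three hypothetical checkpoints and three classes; in the actual setting there would be ten checkpoints and nine classes.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(per_checkpoint_logits):
    """Uniformly average softmax probabilities across checkpoints, then argmax."""
    probs = [softmax(logits) for logits in per_checkpoint_logits]
    n_classes = len(probs[0])
    mean_probs = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    pred = max(range(n_classes), key=lambda c: mean_probs[c])
    return pred, mean_probs

# Toy trial: the first checkpoint prefers class 0, the other two prefer class 1.
toy_logits = [[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [0.5, 2.5, 0.0]]
toy_pred, toy_mean_probs = ensemble_predict(toy_logits)
```

Averaging probabilities rather than hard votes lets a confident checkpoint outweigh several uncertain ones, which matters when checkpoints disagree only on a small subset of trials.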

## Appendix S15 Statistical Methods

### Bootstrap CIs\.

Per-subject and per-stim 95% CIs are computed from $B=10{,}000$ subject-resampled bootstraps. For cohort correlations, we report the bootstrap median, 2.5% / 97.5% percentile interval, and a Fisher-$z$ $p$-value for $r=0$.
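A minimal sketch of a subject-resampled percentile bootstrap is shown below. It assumes the statistic of interest is the mean of one value per subject; the actual analysis may bootstrap a correlation instead, and the Fisher-$z$ test is not included. The function name and the synthetic per-subject values are hypothetical.

```python
import random

def bootstrap_ci(per_subject_values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap over subjects for the mean of per-subject values."""
    rng = random.Random(seed)
    n = len(per_subject_values)
    stats = []
    for _ in range(n_boot):
        # Resample subjects with replacement and recompute the statistic.
        sample = rng.choices(per_subject_values, k=n)
        stats.append(sum(sample) / n)
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    med = stats[n_boot // 2]  # approximate bootstrap median
    return med, lo, hi

# Synthetic values for 123 subjects (illustrative only, not real data).
toy_rng = random.Random(1)
toy_values = [0.2 + 0.1 * toy_rng.gauss(0.0, 1.0) for _ in range(123)]
toy_med, toy_lo, toy_hi = bootstrap_ci(toy_values)
```

Resampling at the subject level, rather than at the trial level, keeps all trials from one subject together and so respects the within-subject dependence in the data.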

### Random\-direction null\.

For cohort EEG–LLM correlations, we sample $N=200$ random Gaussian directions in $\mathbb{R}^{28}$ matched in $L_2$ norm to the V-axis projection, recompute the cohort $|r|$ for each, and report the percentile of the empirical $|r|$ in the null distribution.
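The null procedure can be sketched as follows, assuming that both the V-axis projection and the cohort EEG feature are length-28 vectors with one value per stimulus, and that the cohort $|r|$ is the absolute Pearson correlation between them. These shapes, the function names, and the synthetic data are assumptions for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def random_direction_null(v_axis, eeg, n_null=200, seed=0):
    """Percentile of the observed |r| among |r| values from random directions."""
    rng = random.Random(seed)
    target_norm = math.sqrt(sum(a * a for a in v_axis))
    observed = abs(pearson(v_axis, eeg))
    null = []
    for _ in range(n_null):
        w = [rng.gauss(0.0, 1.0) for _ in v_axis]
        # Match the L2 norm of the V-axis projection (Pearson r itself is
        # scale-invariant, so this only mirrors the described protocol).
        scale = target_norm / math.sqrt(sum(a * a for a in w))
        w = [a * scale for a in w]
        null.append(abs(pearson(w, eeg)))
    percentile = 100.0 * sum(r < observed for r in null) / n_null
    return observed, percentile

# Synthetic 28-stimulus example in which the EEG feature tracks the V-axis closely.
toy_rng = random.Random(1)
toy_v = [toy_rng.gauss(0.0, 1.0) for _ in range(28)]
toy_eeg = [a + 0.1 * toy_rng.gauss(0.0, 1.0) for a in toy_v]
toy_observed, toy_percentile = random_direction_null(toy_v, toy_eeg)
```

With only 28 stimuli, random directions can produce sizeable $|r|$ values by chance, which is why the percentile against this null is reported alongside the raw correlation.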

### Cross\-architecture null\.

For the 36-checkpoint cross-arch correlation, we sample $N=1000$ random directions matched in norm to $v_{\mathrm{CLIP}}$ and recompute the BACC–V-axis correlation. The null is distributed broadly ($\sigma=0.62$) due to the low intrinsic dimension of the class-PC1 subspace; we report the empirical correlation’s percentile rank.

### Per\-checkpoint LOO contribution\.

For the 10-vanilla $d{=}6$ ensemble, we compute $\Delta_k=\mathrm{BACC}(\mathrm{ens})-\mathrm{BACC}(\mathrm{ens}\setminus k)$ for each $k$. Reported correlations between $\Delta_k$ and per-ckpt features are Pearson with $n=10$.
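The leave-one-out quantity $\Delta_k$ can be sketched as below, assuming the ensemble is formed by uniform probability averaging followed by argmax and that BACC is the mean per-class recall. The three toy checkpoints and four trials are illustrative only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def ensemble_labels(prob_sets):
    """Average class probabilities over checkpoints per trial, then argmax."""
    n_trials, n_classes = len(prob_sets[0]), len(prob_sets[0][0])
    labels = []
    for t in range(n_trials):
        mean = [sum(p[t][c] for p in prob_sets) / len(prob_sets) for c in range(n_classes)]
        labels.append(max(range(n_classes), key=lambda c: mean[c]))
    return labels

def loo_contributions(prob_sets, y_true):
    """Delta_k = BACC(full ensemble) - BACC(ensemble without checkpoint k)."""
    full = balanced_accuracy(y_true, ensemble_labels(prob_sets))
    deltas = []
    for k in range(len(prob_sets)):
        rest = prob_sets[:k] + prob_sets[k + 1:]
        deltas.append(full - balanced_accuracy(y_true, ensemble_labels(rest)))
    return deltas

# Toy data: 3 checkpoints x 4 trials x 2 classes.
toy_y = [0, 0, 1, 1]
toy_probs = [
    [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]],      # confident and correct
    [[0.6, 0.4], [0.4, 0.6], [0.4, 0.6], [0.6, 0.4]],      # wrong on trials 1 and 3
    [[0.6, 0.4], [0.35, 0.65], [0.45, 0.55], [0.7, 0.3]],  # wrong on trials 1 and 3
]
toy_deltas = loo_contributions(toy_probs, toy_y)
```

In this toy case only the first checkpoint has a positive $\Delta_k$: removing it flips two trials, while removing either of the others leaves the ensemble prediction unchanged.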

### V\-axis intervention comparisons\.

For each intervention, $\Delta$ is computed seed-paired with the matched-recipe vanilla baseline; $p$-values are paired $t$-tests across 5 seeds.
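A minimal sketch of the seed-paired comparison follows. The BACC values are invented for illustration and are not results from the paper. The sketch computes the paired $t$ statistic and compares it with the two-sided 5% critical value for 4 degrees of freedom ($\approx 2.776$) instead of computing an exact $p$-value, since the standard library has no $t$-distribution CDF.

```python
import math
from statistics import mean, stdev

def paired_t(intervention, baseline):
    """Mean paired difference, t statistic, and degrees of freedom."""
    diffs = [a - b for a, b in zip(intervention, baseline)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return mean(diffs), t_stat, n - 1

# Hypothetical per-seed BACC for the five seeds (illustrative numbers only).
toy_baseline = [0.660, 0.655, 0.662, 0.658, 0.656]
toy_intervention = [0.645, 0.643, 0.650, 0.640, 0.641]
toy_delta, toy_t, toy_df = paired_t(toy_intervention, toy_baseline)

T_CRIT_DF4 = 2.776  # two-sided alpha = 0.05 critical value for df = 4
toy_significant = abs(toy_t) > T_CRIT_DF4
```

Pairing by seed removes seed-to-seed variation that is shared between the intervention and the baseline, which is what allows differences of around $0.01$ BACC to be resolved with only five seeds.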

## Appendix S16 Discussion Extras and Future Work

### Broader impacts\.

Improved EEG emotion classifiers have clear positive applications in mental\-health monitoring, affective brain–computer interfaces, and clinical phenotyping of affect disorders\. The same techniques pose risks of affective inference without consent in surveillance, hiring, workplace, education, and advertising contexts\. The V\-axis extraction protocol applied to EEG could in principle reduce the calibration\-data burden for affective inference systems — a feature that cuts both ways, since lower calibration cost lowers the deployment threshold for both clinical and surveillance settings\. We do not release human EEG data; we release training configs, feature\-side and analysis code, and extracted V\-axis directions\. Recommended deployment safeguards: informed consent, opt\-in only, on\-device inference, no longitudinal storage of raw EEG outside research contexts, and IRB review for any non\-clinical inference application\. Dual\-use risk for the V\-axis extraction itself is low: it operates on text and image features already public, and the EEG mapping requires per\-cohort EEG access that is governed by the original dataset licences\.

### Implications for affective neuroscience\.

For video-evoked emotion on FACED, the V-axis is encoded predominantly in posterior visual cortex ($|r|_{\mathrm{occipital}}=0.21$ vs. $|r|_{\mathrm{frontal}}=0.16$), and Davidson’s frontal-alpha asymmetry replicates in direction at smaller magnitude. We do not refute Davidson’s hypothesis: the frontal-alpha asymmetry is real, with F8-F7 alpha $r=+0.0155$ and Fp2-Fp1 alpha $r=+0.0116$, both in the hypothesised direction. We add a posterior empirical account that is dominant for video paradigms. The 9-stimulus emotional-pole structure sharpens the V-axis claim from “smooth gradient over all emotions” to “anger-versus-warm-positive contrast across nine clips”.

### Per\-subject V\-axis adaptation does not transfer\.

The per-subject best-channel oracle reaches $\overline{|r|}=0.62$ (a $+0.60$ absolute headroom over the cohort-fixed top-8 of $|r|=0.02$). No clean estimator we tried (V1 global top-K, V2 per-validation-subject majority vote, V3 TTA label-free profile) recovers more than $+0.001$ over the cohort top-K. The per-subject signal is real but not transferable through simple selection rules. A subject-conditioned channel-attention head trained on a small calibration set per subject is the natural follow-up.

### Future work\.

Three lines: (i) per-subject V-axis adaptation closing the $|r|=0.62$ oracle gap via subject-conditioned channel-attention; (ii) replicating the saturation regularity for other concept directions (Arditi et al., [2024](https://arxiv.org/html/2606.00129#bib.bib72)) (arousal, dominance, formality, refusal) on their respective benchmarks; (iii) using the within-class residual $|r|$ as an online model-selection signal during training rather than as a post-hoc diagnostic.

## Appendix S17 FACED Test Confusion Matrices

Figure [19](https://arxiv.org/html/2606.00129#A17.F19) contrasts the best-single-checkpoint and 10-checkpoint-ensemble confusion matrices on the FACED 9-class test split.

![Refer to caption](https://arxiv.org/html/2606.00129v1/x19.png)

Figure 19: FACED 9-class test confusion. Left: best single checkpoint (BACC $0.6755$). Right: 10-checkpoint ensemble (BACC $0.6948$). The ensemble’s gain is concentrated on the off-diagonal cells where single-seed disagreement is highest, consistent with the within-class residual mechanism of Section [8](https://arxiv.org/html/2606.00129#S8).

## Appendix S18 Negative Results Catalogue

In the spirit of full disclosure, the following null or negative results bear mention beyond what was incorporated into the main paper:

1. Theta-gamma cross-frequency coupling does not mediate the V-axis (Section [7](https://arxiv.org/html/2606.00129#S7), Appendix [S9](https://arxiv.org/html/2606.00129#A9)): Tort modulation index $\rho(\mathrm{PAC},\mathrm{V}r)=+0.082$, $p=0.667$.
2. Mutual-information control is not significant (Section [6](https://arxiv.org/html/2606.00129#S6)): MI $p=0.115$ vs. linear $p=0.020$. The V-axis-EEG relationship is essentially linear.
3. Random-direction null on cross-arch correlation (Section [4](https://arxiv.org/html/2606.00129#S4)): $r=+0.885$ at the $93.5^{\mathrm{th}}$ percentile of a 1000-direction null, $p_{\mathrm{one}}=0.066$. Within-class residual ($r=+0.74$, $p=0.014$) is the statistically robust signal.
4. Per-subject V-axis adaptation does not transfer. The per-subject best-channel oracle reaches $|r|=0.62$, but no clean estimator transfers more than $+0.001$ over the cohort top-K.
5. Cross-architecture ensembling does not add diversity. Mixing $d\in\{4,6,8,10\}$ checkpoints gives within-arch $0.6798$ vs. cross-arch $0.6791$ (n.s.).
6. Mega-ensembles plateau at 10 checkpoints. 15-, 20-, 25-checkpoint pools all hit BACC $\leq 0.6948$ (Appendix [S13](https://arxiv.org/html/2606.00129#A13)).
7. Stimulus-aggregation re-ranking does not help. Operating at the 28-stimulus level instead of trial level gives BACC=1.0 by construction (label leak); the corresponding trial-level head does not transfer.
8. Test-time augmentation at K=5 does not improve the ensemble ($0.6753\to 0.6720$). Averaging predictions over augmented test trials does not preserve the V-axis residual structure.
9. Toxicity: recipe scope. The 9-prompt PCA recipe attains AUC $0.59$ on Jigsaw toxic comments, below the 17/20 working concepts in our library (Appendix [S4](https://arxiv.org/html/2606.00129#A4)); toxicity appears to be encoded in a higher-rank subspace than the late-layer PC1.
10. Arousal asymmetry: vision yes, text and brain no. OASIS arousal $r=0.803$ in vision; $\leq 13\%$ NRC arousal recovery in text; $r\in[0.18,0.41]$ brain alignment across LLMs (Appendix [S6](https://arxiv.org/html/2606.00129#A6)).
11. LLM-content of KD is irrelevant at this scale. Replacing the 9-D LLM-derived class prototypes with random orthonormal 9-D directions changes BACC by $\leq 0.003$. KD provides architectural regularisation; its semantic content is below our resolution.

## Appendix S19 Reproducibility

### Release commitment\.

Code, configs, model checkpoints, and figure-generation scripts will be released upon acceptance (no submission-time anonymous URL is provided). The release will include: V-axis extraction scripts for the 14 LLMs in the per-LLM table; the EMOD $d{=}6$ training pipeline with the exact augmentation, KD, and ensemble-construction code; the 10-checkpoint $0.6948$-BACC ensemble checkpoints; and the figure-generation scripts for every landmark and neuro figure in the paper.

### Run\-level provenance\.

Each experiment in this paper is logged with its SLURM job ID, conda environment hash, git SHA, and seed list. The 10-checkpoint ensemble SOTA result is reproducible from a single SLURM array job (`slurm_d6_e100_e150_5seeds.sh`, $\sim$10 GPU-hours per seed on V100).

## Appendix S20 Resource Estimate

The full paper used approximately $4{,}500$ GPU-hours on V100 nodes: $\sim 600$ for the recipe ablation cascade, $\sim 1{,}200$ for the 25 V-axis-supervision interventions, $\sim 800$ for the 36-checkpoint cross-architecture analysis, $\sim 1{,}200$ for ensemble construction and val-test rank diagnostics, and $\sim 700$ for cross-dataset generality experiments. The V-axis extraction itself is computationally cheap ($\sim 10$ minutes per LM on a single 80GB GPU).
