MagBridge-Battery: A Synthetic Bridge Dataset for Li-ion Magnetometry and State-of-Health Diagnostics

arXiv cs.LG Papers

Summary

This paper introduces MagBridge-Battery, a synthetic dataset of 6,760 magnetic-field signatures for Li-ion battery state-of-health diagnostics, combining real magnetic morphology with real degradation labels to bridge the gap in public magnetic-sensing battery data.

arXiv:2605.20240v1 Announce Type: new Abstract: Battery health diagnostics today rely overwhelmingly on electrochemical signals measured at the cell terminals. A parallel literature has shown that magnetic sensing can resolve information that terminal-only measurements miss, but method development is limited by the absence, to the best of our knowledge, of public battery magnetic-measurement datasets paired with degradation labels. We release MagBridge-Battery v1.0, a synthetic dataset of 6,760 magnetic-field signatures that bridges real magnetic morphology from the Mohammadi-Jerschow Open Science Framework (OSF) archive with state-of-health (SOH) labels from the PulseBat dataset. The release contains 5,600 PulseBat-conditioned grounded samples, 600 synthetic sensor-anomaly samples derived from clean parents, and 560 low-voltage Regime-B extrapolation samples. A cell-disjoint, parent-child-leakage-free primary benchmark split is verified to contain zero overlapping cells, zero cross-split parent-child pairs, and zero sample-ID overlap. We define three primary benchmark tasks: SOH regression, second-life classification, and anomaly detection, plus an auxiliary anomaly-subtype classification task. A controlled label-shuffle ablation collapses SOH regression from R^2 approximately 0.77 to approximately 0, confirming that the bridge encodes input SOH non-trivially rather than producing label-aligned artifacts. The dataset is released on Zenodo under CC-BY-4.0, and the bridge code and benchmark suite are released under Apache-2.0. This work provides a public benchmark for magnetic-sensing battery diagnostics while paired magnetic-electrochemical measurements remain scarce.

Cached at: 05/21/26, 06:20 AM

# MagBridge-Battery: A Synthetic Bridge Dataset for Li-ion Magnetometry and State-of-Health Diagnostics
Source: [https://arxiv.org/html/2605.20240](https://arxiv.org/html/2605.20240)
###### Abstract

Battery health diagnostics today rely overwhelmingly on electrochemical signals at the cell terminals. A parallel literature has shown that magnetic sensing can resolve information that terminal-only measurements miss, but the central obstacle to method development in this area is the absence, to the best of our knowledge, of public battery magnetic-measurement datasets paired with degradation labels. We release MagBridge-Battery v1.0, a synthetic dataset of 6,760 magnetic-field signatures bridging real magnetic morphology from the Mohammadi–Jerschow Open Science Framework (OSF) archive with real state-of-health (SOH) labels from the PulseBat dataset. The release contains 5,600 PulseBat-conditioned grounded samples, 600 synthetic sensor-anomaly samples derived from clean parents, and 560 low-voltage Regime-B extrapolation samples. A cell-disjoint, parent-child-leakage-free primary benchmark split is verified to contain zero overlapping cells, zero cross-split parent-child pairs, and zero sample-ID overlap. We define three primary benchmark tasks (SOH regression, second-life classification, anomaly detection) and one auxiliary anomaly-subtype classification task, and validate the dataset with a controlled ablation suite: a label-shuffle ablation collapses SOH regression from $R^2 \approx 0.77$ to $\approx 0$, confirming that the bridge encodes input SOH non-trivially rather than producing label-aligned artefacts. The dataset is released on Zenodo under CC-BY-4.0; the bridge code and benchmark suite are released under Apache-2.0. This work bridges a gap in public data for magnetic-sensing battery diagnostics while paired magnetic–electrochemical measurements remain scarce.

## I Motivation

Battery health diagnostics today rely overwhelmingly on *electrochemical signals*: voltage, current, temperature, and impedance measurements taken at the cell terminals. The public datasets that drive method development reflect this. NASA, Oxford, CALCE, Stanford/Severson, MATR, HUST, XJTU, and PulseBat all provide rich electrochemical time-series at varying degrees of degradation, with well-characterised SOH labels and, in many cases, full degradation trajectories \[[10](https://arxiv.org/html/2605.20240#bib.bib1), [11](https://arxiv.org/html/2605.20240#bib.bib2)\]. These datasets have enabled a generation of work on early-life prediction, capacity-fade estimation, and second-life classification.

What none of them capture is the *spatial distribution of current and magnetisation inside the cell*. A terminal-only measurement is, by construction, blind to localised hot-spots of charge storage, inhomogeneous redox at electrode surfaces, dendrite formation, and the kinds of internal defects that emerge during ageing or that signal manufacturing flaws. A parallel research literature, much smaller and almost entirely without public data, has shown that magnetometry can resolve precisely this missing information \[[6](https://arxiv.org/html/2605.20240#bib.bib7), [9](https://arxiv.org/html/2605.20240#bib.bib9), [4](https://arxiv.org/html/2605.20240#bib.bib11), [5](https://arxiv.org/html/2605.20240#bib.bib12), [8](https://arxiv.org/html/2605.20240#bib.bib14), [1](https://arxiv.org/html/2605.20240#bib.bib15)\].

This work has accelerated significantly in the past several years. Optically-pumped magnetometers, nitrogen-vacancy diamond sensors, and SQUID arrays have matured into measurement systems suitable for routine use on commercial cells. The QuaLiProM consortium (Fraunhofer IFAM, FAU Erlangen, and industrial partners; BMBF-funded; running through November 2026) is explicitly building an industrial pipeline that combines magnetometry with deep learning for second-life classification of retired batteries \[[2](https://arxiv.org/html/2605.20240#bib.bib17)\]. The Aachen–Jülich–Sussex–PTB collaboration recently demonstrated quantum magnetic imaging of 6000 mAh cells \[[1](https://arxiv.org/html/2605.20240#bib.bib15)\]. Operando magnetic microscopy of ionic and electronic current distributions appeared in *Nature Communications* in late 2025 \[[8](https://arxiv.org/html/2605.20240#bib.bib14)\]. The trajectory is clear: magnetic sensing is becoming a first-class modality for battery diagnostics.

### I-A The gap

Despite this momentum, two things are conspicuously missing from the public landscape. First, *to the best of our knowledge, no public dataset pairs battery magnetic measurements with degradation labels*. A systematic search of Zenodo, OSF, GitHub, Hugging Face, and the academic literature surfaces exactly one publicly archived raw magnetic-scan dataset of batteries: the OSF archive associated with Mohammadi, Ilott, and Jerschow (henceforth “OSF”) \[[4](https://arxiv.org/html/2605.20240#bib.bib11), [7](https://arxiv.org/html/2605.20240#bib.bib16)\]. It contains high-resolution magnetic field measurements of a single Li-ion cell at five operating voltages, with 41 scan positions per voltage. It does not include SOH labels, multi-cell variation, or a degradation trajectory. Every other publication reporting magnetic measurements of batteries uses proprietary or unreleased data.

Second, *no bridge connects the rich electrochemical-degradation datasets to any magnetic-sensing modality*. Researchers building methods for magnetic SOH estimation, second-life classification, or anomaly detection have no public benchmark to develop against. Cross-lab comparison is essentially impossible.

This work asks: can the community make practical progress without waiting for paired magnetic–electrochemical data to become public, by *bridging* the two modalities synthetically?

## II The MagBridge-Battery v1.0 Dataset

MagBridge\-Battery v1\.0 is the central artifact of this work\. We describe its composition, schema, splits, and integrity properties here\. The bridge architecture that produced it is described in §[III](https://arxiv.org/html/2605.20240#S3); validation in §[IV](https://arxiv.org/html/2605.20240#S4)\.

### II-A Composition

The release contains 6,760 magnetic signatures partitioned by provenance into three groups (Fig. [1](https://arxiv.org/html/2605.20240#S2.F1)):

- 5,600 PulseBat-conditioned grounded samples. Generated by the bridge in the grounded regime ($v \in [3.06, 3.34]$ V), conditioned on PulseBat-derived (SOH, SOC, U-features) drawn from real retired-cell pulse tests. Every sample carries the full label set.
- 600 synthetic sensor-anomaly samples. Four subtypes, 150 each, derived from clean parents via controlled perturbations: `sensor_dropout`, `calibration_drift`, `temporal_warp`, `periodic_interference`. Each anomaly row carries a `parent_sample_id` pointing to its clean parent.
- 560 low-voltage Regime-B extrapolation samples. Clustered at three low-voltage anchors (`nearest_anchor` $\in \{2.54, 2.81, 3.00\}$ V), outside the PulseBat in-distribution range. *Regime-B samples are intended for low-voltage / OOD / anomaly-style evaluation, not SOH regression.* `soh`, `u_features`, and `second_life_class` are `NaN` by design.

Every sample is a length\-100 sequence with six signal channels plus a constant normalised time axis\. MagBridge\-Battery v1\.0 uses the LFP subset of PulseBat exclusively; NMC and LMO records in PulseBat are reserved for future cross\-chemistry extensions \(§[VII](https://arxiv.org/html/2605.20240#S7)\)\.
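The composition figures above can be cross-checked with a few lines of arithmetic. The sketch below only re-derives numbers quoted in the text (group sizes, four subtypes of 150 each); the dictionary keys are illustrative names, not fields of the release.

```python
# Group sizes quoted in the text for MagBridge-Battery v1.0.
# The keys are illustrative labels, not identifiers from the release.
GROUPS = {"grounded": 5600, "anomaly": 600, "regime_b": 560}

ANOMALY_SUBTYPES = (
    "sensor_dropout",
    "calibration_drift",
    "temporal_warp",
    "periodic_interference",
)
PER_SUBTYPE = 150  # samples per anomaly subtype, as stated above

total = sum(GROUPS.values())

# Share of each group in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in GROUPS.items()}

print(total)   # total number of samples
print(shares)  # percentage share per group
```

The rounded shares agree with the percentages reported in Fig. 1.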

Figure 1: MagBridge-Battery v1.0 composition. The dataset contains 6,760 samples: 5,600 grounded samples (82.8%; PulseBat-conditioned, in-distribution voltage, full labels), 600 synthetic anomaly samples (8.9%; 4 subtypes $\times$ 150 each, derived from clean parents), and 560 low-voltage Regime-B samples (8.3%; 2.54/2.81/3.00 V anchors, SOH = NaN by design). Bar width is proportional to sample count. The primary benchmark split is `by_cell_primary` (train 4,507 / val 1,074 / test 1,179), which is cell-disjoint and parent-child leakage-free: 0 cell overlap and 0 cross-split parent-child pairs.

### II-B Schema

Each row carries six signal channels of length 100:

`B_s1Y`, `B_s1Z`, `B_s2Y`, `B_s2Z`, `B_s1C5`, `B_s2C6`.

The first four are signed Y/Z components of sensors 1 and 2. The last two are the channel-5 and channel-6 fields from the OSF source; the OSF archive labels these channels `Mag`, but their values are signed and can legitimately be negative (123 rows contain negative entries in `B_s1C5`; 86 in `B_s2C6`; minima $-80.47$ and $-94.08$ respectively). We rename them `C5` and `C6` in the release schema to avoid implying a strict $\sqrt{Y^2 + Z^2}$ magnitude interpretation. `B_s1C5` and `B_s2C6` are retained signed source channels and are not interpreted as strict physical magnitudes.

A `time_norm` column carries 100 evenly-spaced values in $[0, 1]$. It is the same vector for every sample: a constant reference grid included for loader convenience, droppable without information loss. The `temporal_warp` anomaly perturbs signal values on this fixed grid; it does not export an irregular per-sample timebase.

Metadata fields include identifiers (`sample_id`, `parent_sample_id`, `cell_id`, `generation_seed`), provenance (`bridge_version`, `bridge_config_hash`, `schema_version`), state labels (`voltage`, `soc`, `soh`, `chemistry`, `regime`, `nearest_anchor`, `u_features`, `second_life_class`), and anomaly labels (`anomaly_flag`, `anomaly_subtype`, `anomaly_origin`, `anomaly_severity`).

### II-C Benchmark splits

We provide two splits.

All primary benchmark results are reported on the `by_cell_primary` split, which is cell-disjoint and parent-child leakage-free. Train / validation / test counts are 4,507 / 1,074 / 1,179 (Fig. [1](https://arxiv.org/html/2605.20240#S2.F1)). Verified guarantees: zero physical cells overlap between subsets, zero (clean parent, anomaly child) pairs cross subset boundaries, zero sample-ID overlap. This is the split to use for any reported number.

The companion `by_record_optimistic_baseline` split is provided *only* as a contrast and is not recommended for benchmark reporting. Its leakage is quantified explicitly: 59 cells appear in more than one subset, and 292 parent-child pairs are split across subset boundaries. A model trained on this split will appear artificially strong; we ship it so the inflation effect can be measured directly.
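The three leakage quantities named above (cells shared between subsets, parent-child pairs crossing subsets, shared sample IDs) can be counted with a short audit. This is a minimal sketch, not the release's own verification code: the split layout (`{"train": [...], ...}`) and the `rows` mapping are assumed shapes, and `audit_split` is a hypothetical helper name. Only `sample_id`, `cell_id`, and `parent_sample_id` come from the schema.

```python
from itertools import combinations


def audit_split(split, rows):
    """Count leakage in a split.

    split: {"train": [sample_id, ...], "val": [...], "test": [...]}  (assumed layout)
    rows:  {sample_id: {"cell_id": ..., "parent_sample_id": ... or None}}
    """
    subset_of = {sid: name for name, ids in split.items() for sid in ids}
    ids = {name: set(sample_ids) for name, sample_ids in split.items()}
    cells = {name: {rows[sid]["cell_id"] for sid in s} for name, s in ids.items()}

    shared_cells, shared_ids = set(), set()
    for a, b in combinations(split, 2):
        shared_cells |= cells[a] & cells[b]
        shared_ids |= ids[a] & ids[b]

    # A pair "crosses" when parent and child both appear in the split
    # but are assigned to different subsets.
    crossing = [
        (row["parent_sample_id"], sid)
        for sid, row in rows.items()
        if row.get("parent_sample_id") is not None
        and sid in subset_of
        and row["parent_sample_id"] in subset_of
        and subset_of[sid] != subset_of[row["parent_sample_id"]]
    ]
    return {
        "cells_in_multiple_subsets": len(shared_cells),
        "cross_split_parent_child_pairs": len(crossing),
        "sample_id_overlap": len(shared_ids),
    }
```

Run against a cell-disjoint split, all three counts should be zero; a record-level split of the same rows will generally report non-zero cell and parent-child counts.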

### II-D Integrity properties

The release was audited on the shipped artifacts. Verified: zero duplicate `sample_id` values across all 6,760 rows; zero duplicate full-signal hashes; zero NaN or infinity entries in any of the six signal channels; uniform length-100 signal arrays; exact metadata-to-shard ID correspondence; valid parent-child links for all 600 synthetic anomalies (all parents exist and are clean); and zero residual occurrences of deprecated schema fields or legacy anomaly labels. SHA-256 checksums for every shipped file are included in the release bundle.
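Several of these checks are easy to reproduce on loaded rows. The sketch below covers duplicate IDs, duplicate full-signal hashes, non-finite values, and shape uniformity. It is an illustration under assumptions: the row layout (`{"sample_id": ..., "signal": [channel lists]}`), the helper names, and the choice to hash little-endian float64 bytes are all hypothetical and need not match how the release computed its hashes.

```python
import hashlib
import math
import struct


def signal_hash(channels):
    """SHA-256 over the float64 bytes of every channel, in channel order."""
    h = hashlib.sha256()
    for ch in channels:
        h.update(struct.pack(f"<{len(ch)}d", *ch))
    return h.hexdigest()


def audit_rows(rows, length=100, n_channels=6):
    """Return a list of (problem, sample_id) tuples; empty means all checks pass."""
    seen_ids, seen_hashes, problems = set(), set(), []
    for row in rows:
        sid, channels = row["sample_id"], row["signal"]
        if sid in seen_ids:
            problems.append(("duplicate_id", sid))
        seen_ids.add(sid)
        if len(channels) != n_channels or any(len(ch) != length for ch in channels):
            problems.append(("bad_shape", sid))
            continue  # skip value checks on malformed rows
        if any(not math.isfinite(x) for ch in channels for x in ch):
            problems.append(("non_finite", sid))
        digest = signal_hash(channels)
        if digest in seen_hashes:
            problems.append(("duplicate_signal", sid))
        seen_hashes.add(digest)
    return problems
```

The per-file SHA-256 checksums shipped with the bundle are a separate check on the files themselves and are not replaced by this row-level audit.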

### II-E File layout, license, and citation

The release ships as a single bundle: five Parquet shards of 1,352 rows each, a metadata-only Parquet view, the two split JSONs, a generation manifest pinning bridge version and configuration hash, a minimal Python loader, SHA-256 checksums, and licensing files. The dataset is licensed CC-BY-4.0; release code is licensed Apache-2.0; the LICENSE file documents the upstream sources (OSF magnetometry archive and PulseBat dataset) and their respective license declarations. *No raw upstream data is redistributed in this release*; aggregate statistics derived from the OSF archive (per-anchor means and variances) are embedded in the bridge implementation but not in the released data files.

Users are kindly requested to cite both this paper and the dataset DOI; see `CITING.md` in the release bundle for the recommended dual citation. The release is on Zenodo (DOI: 10.5281/zenodo.20260147) and the code is on GitHub at [https://github.com/SakthiGs/MagBridge-Battery](https://github.com/SakthiGs/MagBridge-Battery).

## III Bridge Architecture

The bridge is a deterministic function $\mathcal{B}(v, \text{SOC}, \text{SOH}, \mathbf{u}; \theta) \rightarrow \mathbf{X} \in \mathbb{R}^{T \times C}$ that maps a generation request (operating voltage $v$, state of charge SOC, state of health SOH, and PulseBat U-feature vector $\mathbf{u} \in \mathbb{R}^{21}$) to a synthetic magnetic-signature time series of length $T = 100$ across $C = 6$ channels (§[II-B](https://arxiv.org/html/2605.20240#S2.SS2)). The configuration $\theta$ collects all tunable parameters, fixed once at bridge instantiation. Generation is reproducible: a given (request, $\theta$, seed) tuple always produces the same output.

Figure [2](https://arxiv.org/html/2605.20240#S3.F2) summarises the bridge’s data flow. The bridge has four components, applied in sequence: a regime classifier, a morphology bank derived from OSF, a degradation modulator conditioned on PulseBat labels via a learned latent representation we call MagBridge-Embed, and a noise model. The full mathematical specification of the degradation modulator (equations for LDA projection, perturbation, $k$-NN softmin decoding, and base–modulated blending) is provided in Appendix [A](https://arxiv.org/html/2605.20240#A1); we describe the architecture at the conceptual level here.

Figure 2: Bridge architecture. Real magnetic morphology from the OSF archive (Mohammadi–Jerschow; 1 cell, 5 voltage anchors, 41 scans per anchor) and SOH/SOC/U-feature labels from PulseBat (Tao et al.; 464 retired Li-ion cells) are combined through a morphology bank ($\boldsymbol{\mu}_v, \boldsymbol{\sigma}_v$ per anchor), the MagBridge-Embed degradation modulator (171-D embedding $\rightarrow$ 4-D LDA $\rightarrow$ cone-restricted $k$-NN softmin decode), a regime classifier (grounded / Regime-B), and a noise model (sensor noise + SOC fluctuation) to generate MagBridge-Battery v1.0 (6,760 samples, $T = 100$, $C = 6$, CC-BY-4.0). Per-anchor aggregate statistics from OSF and per-cell label/feature values from PulseBat are used as bridge inputs. Only synthetic outputs are released; no raw upstream files are redistributed.

### III-A Regime classifier

The bridge handles two operating regimes, derived from cross\-data analysis of the OSF voltage anchors and the PulseBat U\-feature distribution:

- Grounded regime ($v \in [3.06, 3.34]$ V): both OSF morphology and PulseBat conditioning populate this range. The bridge interpolates between the OSF anchors at 3.10 V and 3.34 V using PulseBat-derived (SOH, SOC) as conditioning.
- Regime B (extrapolation) ($v \in [2.54, 3.06)$ V): only OSF populates this range; PulseBat does not test in over-discharge for safety reasons. The bridge reproduces OSF morphology here without PulseBat-grounded conditioning. As stated in §[II-A](https://arxiv.org/html/2605.20240#S2.SS1), Regime-B samples are intended for low-voltage / OOD evaluation, not SOH regression.

Voltages outside $[2.54, 3.34]$ V are unsupported and rejected. Every generated sample carries its regime as metadata.
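The two voltage intervals and the rejection rule translate directly into a small function. The sketch below is an illustration of the stated boundaries only; the function name and the returned strings are assumptions, not identifiers from the bridge code.

```python
# Interval boundaries as stated in the text (volts).
REGIME_B_MIN = 2.54
GROUNDED_MIN = 3.06
GROUNDED_MAX = 3.34


def classify_regime(voltage):
    """Map an operating voltage to a regime label, or reject it.

    Grounded:  3.06 <= v <= 3.34
    Regime B:  2.54 <= v <  3.06
    Anything else raises ValueError (unsupported voltage).
    """
    if GROUNDED_MIN <= voltage <= GROUNDED_MAX:
        return "grounded"
    if REGIME_B_MIN <= voltage < GROUNDED_MIN:
        return "regime_b"
    raise ValueError(
        f"voltage {voltage} V is outside [{REGIME_B_MIN}, {GROUNDED_MAX}] V"
    )
```

Note that 3.06 V itself belongs to the grounded regime because the Regime-B interval is half-open.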

### III-B Morphology bank

The OSF archive is canonicalised into per-anchor empirical statistics: at each of the five anchor voltages $\{2.54, 2.81, 3.00, 3.10, 3.34\}$ V, the bridge extracts the mean trajectory $\boldsymbol{\mu}_v \in \mathbb{R}^{T \times C}$ and per-timestep, per-channel standard deviation $\boldsymbol{\sigma}_v \in \mathbb{R}^{T \times C}$ across all 41 scan positions. For a generation request at voltage $v$, the bank produces a base sample by linear interpolation between bracketing anchors (full equations in Appendix [A](https://arxiv.org/html/2605.20240#A1)). At an exact OSF anchor with no perturbation, the bank reproduces the empirical mean trajectory to machine precision (sanity invariant 1, §[IV](https://arxiv.org/html/2605.20240#S4)).
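Linear interpolation between bracketing anchors can be sketched as follows. The exact equations are in the paper's Appendix A, which is not reproduced here, so the weight $\alpha = (v - v_{lo}) / (v_{hi} - v_{lo})$ is the textbook form and an assumption on our part; `interpolate_base` and the `anchor_means` mapping are hypothetical names.

```python
from bisect import bisect_right

ANCHORS = (2.54, 2.81, 3.00, 3.10, 3.34)  # OSF anchor voltages (V)


def interpolate_base(voltage, anchor_means):
    """Interpolate the mean trajectory between the two bracketing anchors.

    anchor_means: {anchor_voltage: T x C nested list of floats}
    Returns a T x C nested list.
    """
    if not ANCHORS[0] <= voltage <= ANCHORS[-1]:
        raise ValueError("voltage outside the anchor range")
    # Index of the upper bracketing anchor, clamped so the top anchor
    # brackets with its lower neighbour.
    hi = min(bisect_right(ANCHORS, voltage), len(ANCHORS) - 1)
    lo = hi - 1
    v_lo, v_hi = ANCHORS[lo], ANCHORS[hi]
    alpha = (voltage - v_lo) / (v_hi - v_lo)
    mu_lo, mu_hi = anchor_means[v_lo], anchor_means[v_hi]
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(row_lo, row_hi)]
        for row_lo, row_hi in zip(mu_lo, mu_hi)
    ]
```

With this form, a request at an exact anchor gets weight 0 or 1 and returns that anchor's mean unchanged, which is the behaviour the identity invariant relies on.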

### III-C Degradation modulator and MagBridge-Embed

The degradation modulator applies SOH-driven perturbation to the base morphology through a learned latent representation. We refer to this representation as MagBridge-Embed: a 171-dimensional embedding produced by a fixed quantum reservoir computer \[[3](https://arxiv.org/html/2605.20240#bib.bib18)\] (10 qubits arranged as 4 memory + 6 processor, 2 reservoir layers, pooling $\{$last, mean, std$\}$). The reservoir parameters are not trained; only the downstream linear readouts that operate in this embedding space are.

The modulator’s data flow is:

1. Compute the MagBridge-Embed of the base sample.
2. Project to a 4-D Linear Discriminant Analysis (LDA) subspace fit on real OSF samples with voltage anchors as classes. In this subspace, within-anchor scatter is $\sim 1.78$ and inter-anchor separation is $\sim 279$ (ratio $\sim 157$).
3. Perturb in the LDA subspace along the fitted state direction with magnitude proportional to $(1 - \text{SOH})$.
4. Decode back to the time domain via cone-restricted $k$-NN softmin retrieval against the 205 OSF samples, with $k = 8$ and a $75^{\circ}$ alignment cone.
5. Blend the decoded morphology with the base sample proportionally to $(1 - \text{SOH})$.
6. Apply per-channel amplitude scaling and SOH-scaled spectral broadening; the spectral broadening is mildly modulated by PulseBat U-feature dispersion.
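Steps 4 and 5 can be illustrated with a simplified sketch. It is not the bridge's decoder: the cone restriction is omitted, the softmin temperature is a hypothetical parameter, signals are treated as flat 1-D lists, and the blend weight of exactly $(1 - \text{SOH})$ on the decoded signal is one reading of "proportionally to". The authoritative equations are in the paper's Appendix A.

```python
import math


def softmin_weights(distances, temperature=1.0):
    """Softmin: smaller distances get larger weights; weights sum to 1."""
    d_min = min(distances)  # subtract the minimum for numerical stability
    exps = [math.exp(-(d - d_min) / temperature) for d in distances]
    total = sum(exps)
    return [e / total for e in exps]


def knn_softmin_decode(query, bank_latents, bank_signals, k=8, temperature=1.0):
    """Weighted average of the k bank signals nearest to `query` in latent space."""
    nearest = sorted(
        (math.dist(query, z), i) for i, z in enumerate(bank_latents)
    )[:k]
    weights = softmin_weights([d for d, _ in nearest], temperature)
    length = len(bank_signals[0])
    return [
        sum(w * bank_signals[i][t] for w, (_, i) in zip(weights, nearest))
        for t in range(length)
    ]


def blend(base, decoded, soh):
    """Mix decoded morphology into the base with weight (1 - SOH) (assumed form)."""
    w = 1.0 - soh
    return [(1.0 - w) * b + w * d for b, d in zip(base, decoded)]
```

Under this reading, a request at SOH = 1 leaves the base sample untouched, and lower SOH pulls the output progressively toward the retrieved morphology.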

Section [V-C](https://arxiv.org/html/2605.20240#S5.SS3) reports what the validation suite reveals about which of these components are load-bearing for downstream SOH decoding.

### III-D Noise model

Two additive noise sources are applied to the modulator output: *sensor noise* (zero-mean Gaussian, per-channel standard deviation set to 5% of the corresponding OSF anchor’s empirical channel std), and *SOC-dependent fluctuation* (low-frequency component scaled by $|\text{SOC} - 50| / 50$, implemented as boxcar-smoothed Gaussian noise normalised and scaled to 4% of per-channel range). Ageing-induced disorder is intentionally not added at this stage to avoid double-counting with the spectral broadening already applied.
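A minimal single-channel sketch of the two noise terms follows. The 5% and 4% factors and the $|\text{SOC} - 50| / 50$ scaling come from the text; the boxcar width, the peak normalisation of the smoothed noise, SOC expressed in percent, and the function names are assumptions made for illustration.

```python
import random


def boxcar_smooth(values, width):
    """Moving average with a centred window, truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def add_noise(channel, anchor_std, soc, rng,
              sensor_frac=0.05, soc_frac=0.04, width=9):
    """Add sensor noise and SOC-dependent low-frequency fluctuation to one channel.

    channel:    list of floats (one signal channel)
    anchor_std: empirical channel std at the relevant OSF anchor
    soc:        state of charge in percent (assumed 0..100)
    rng:        random.Random instance, so results are reproducible per seed
    """
    sensor = [rng.gauss(0.0, sensor_frac * anchor_std) for _ in channel]
    slow = boxcar_smooth([rng.gauss(0.0, 1.0) for _ in channel], width)
    peak = max(abs(x) for x in slow) or 1.0  # normalise to unit peak (assumption)
    channel_range = max(channel) - min(channel)
    amplitude = soc_frac * channel_range * abs(soc - 50.0) / 50.0
    return [x + s + amplitude * f / peak for x, s, f in zip(channel, sensor, slow)]
```

Passing a seeded `random.Random` keeps the sketch consistent with the bridge's reproducibility property: the same request and seed give the same noisy output.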

### III-E Synthetic sensor anomalies

For the 600 anomaly samples in the v1.0 release, a clean parent sample is generated by the pipeline above, and one of four perturbations is applied: `sensor_dropout`, `calibration_drift`, `temporal_warp`, or `periodic_interference`. The clean parent’s `sample_id` is preserved as `parent_sample_id` on the anomaly row. The primary split is constructed so that no parent-child pair is split across train/val/test boundaries.

## IV Validation methodology

A bridge that generates plausible\-looking samples is not enough\. We validate the dataset in three nested layers, each of which the bridge must pass before its outputs are released\.

Figure [3](https://arxiv.org/html/2605.20240#S4.F3) summarises the validation and benchmark protocol used for MagBridge-Battery v1.0, including structural checks, falsification ablations, and downstream tasks evaluated on the leakage-safe primary split.

Figure 3: Benchmark and validation protocol for MagBridge-Battery v1.0. The release is evaluated through integrity checks, bridge sanity invariants, distributional sanity at grounded OSF anchors, controlled ablations, and four downstream benchmark tasks on the `by_cell_primary` leakage-safe split. BA denotes balanced accuracy. Panels:

- Integrity checks: unique IDs/hashes, valid parents, no NaN/Inf.
- Sanity invariants: anchor identity, SOH monotonicity, voltage smoothness.
- Distributional sanity: grounded anchors, KS tests, std/correlation checks.
- Controlled ablations: A0 baseline, A1 random direction, A2 shuffled SOH, A3 inverted direction.
- Downstream benchmark tasks: T1 SOH regression (grounded clean; metrics $R^2$, MAE), T2 second-life (grounded clean; metrics BA, F1, AUC), T3 anomaly (clean + anomaly + Regime-B; metric 3-class BA), T4 subtype (four anomaly subtypes; metric 4-class BA).
- Reference evaluation: sklearn baselines, DL baselines, repeated cell-subsampling.

### IV-A Sanity invariants

Five structural invariants are tested automatically on every release candidate; failure on any blocks release\.

1. Identity at OSF anchors. At each of the five OSF anchor voltages, the bridge’s deterministic anchor-replica function (no perturbation, no noise) reproduces the empirical mean trajectory to within $10^{-12}$, i.e. at machine-precision level.
2. SOH monotonicity. For fixed (voltage, SOC, seed), increasing $(1 - \text{SOH})$ from 0 to 0.25 produces monotonically increasing L2 distance from the SOH $= 1.0$ reference in at least one defensible signature metric.
3. Voltage smoothness. A 0.01 V perturbation in voltage produces a smaller change in output than a 0.10 V perturbation, ruling out discontinuities at anchor boundaries.
4. Regime classification correctness. The classifier returns the expected regime at boundary cases $\{2.54, 2.81, 3.00, 3.06, 3.10, 3.34\}$ V and rejects out-of-range tests at $\{2.00, 4.00\}$ V.
5. Anomaly flag consistency. Every Regime-B sample carries `anomaly_flag = True`; every grounded-regime clean sample carries `False`.

### IV-B Distributional sanity at grounded anchors

At the two grounded-regime OSF anchors (3.10 V, 3.34 V), bridge-generated samples at SOH $= 1.0$ should be statistically indistinguishable from real OSF samples. We generate 41 synthetic samples per anchor (matching the OSF scan-position count) and run per-channel, per-timestep two-sample Kolmogorov–Smirnov tests against real OSF samples. We additionally report (a) the ratio of synthetic to real per-channel standard deviations as an amplitude calibration check, and (b) the mean absolute difference of the synthetic and real cross-channel correlation matrices.
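For readers who want to see what the two-sample test measures, the sketch below computes the Kolmogorov–Smirnov statistic (the largest gap between the two empirical CDFs) and the synthetic-to-real standard-deviation ratio in plain Python. It returns the statistic only, not a p-value; in practice a library routine such as `scipy.stats.ks_2samp` would normally be used. The function names are ours, not the benchmark suite's.

```python
import statistics


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all x."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every value <= x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def std_ratio(synthetic, real):
    """Amplitude calibration check: sample std of synthetic over real."""
    return statistics.stdev(synthetic) / statistics.stdev(real)
```

A statistic near 0 means the two samples have nearly identical distributions at that channel and timestep; a value of 1 means they do not overlap at all.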

### IV-C Controlled ablations

A bridge that passes the invariants and the distributional sanity check could still be a label\-aligned artifact — it could encode the SOH input into the output in a way that any standard readout trivially recovers, without the bridge doing physically meaningful work\. To rule this out, we run four ablation scenarios on identical request lists:

- A0 baseline: real fitted state direction, real per-cell SOH labels.
- A1 random direction: replace the fitted state direction with a uniformly-sampled unit vector in the 4-D LDA subspace.
- A2 shuffled SOH labels: permute SOH values across PulseBat records before the bridge sees them, breaking the SOH–cell binding. The benchmark grades regression against the *original* per-cell SOH (preserved in metadata), not the shuffled values driving generation.
- A3 inverted direction: replace the fitted state direction with its negation.

A2 is the load-bearing test. If the bridge is encoding cell-state-truthful information, A2 should collapse downstream regression $R^2$ to approximately zero. A1 and A3 jointly probe whether the specific fitted LDA direction is privileged for downstream decoding.
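The A2 manipulation is a permutation of SOH values across records, with the original value kept aside for grading. A minimal sketch, assuming records are plain dictionaries with a `soh` key; the output keys `soh_original` and `soh_generation` and the function name are hypothetical, chosen only to make the two roles explicit.

```python
import random


def shuffle_soh(records, seed):
    """Return copies of `records` whose generation SOH is permuted across records.

    The original SOH is preserved so that a benchmark can still grade
    predictions against the true per-cell value.
    """
    rng = random.Random(seed)
    soh_values = [r["soh"] for r in records]
    rng.shuffle(soh_values)  # in-place permutation, reproducible per seed
    return [
        {**r, "soh_original": r["soh"], "soh_generation": s}
        for r, s in zip(records, soh_values)
    ]
```

The marginal SOH distribution is unchanged by the shuffle, so any drop in downstream performance is attributable to the broken SOH–cell binding rather than to a shift in label statistics.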

### IV-D Relationship of ablation pilots to the v1.0 release

The four-scenario ablations were executed on focused $\sim$310-sample-per-scenario pilots produced by the v1.0 bridge architecture in the grounded regime. The released v1.0 dataset (§[II](https://arxiv.org/html/2605.20240#S2)) is generated by the *same* v1.0 bridge with the *same* configuration $\theta$ (pinned by `bridge_config_hash` in the manifest), scaled to 5,600 grounded samples plus the 600 anomalies and 560 Regime-B samples. Because the released dataset is generated by the same bridge configuration, the ablation pilots provide architecture-level evidence for the released dataset’s behaviour; we do not re-run ablations at full scale because direction-perturbation contrasts are already statistically saturated at the pilot size.

### IV-E Benchmark tasks

For downstream evaluation we define three tasks:

- T1 SOH regression: continuous SOH (released range $[0.744, 0.962]$, reflecting PulseBat retired-cell SOH distribution) from the magnetic signature. Grounded-regime samples only.
- T2 Second-life classification: binary classification with the cutoff at SOH $= 0.85$ (`reuse` if SOH $> 0.85$, `recondition` otherwise), matching the convention used in the released metadata’s `second_life_class` field. Grounded-regime samples only.
- T3 Anomaly detection: train on grounded-regime clean samples, test on a mix of held-out grounded clean, synthetic-anomaly perturbed, and Regime-B samples.

All three tasks use the `by_cell_primary` split (§[II-C](https://arxiv.org/html/2605.20240#S2.SS3)) with five seeds, and few-shot stratified subsampling at $k \in \{2, 5, 10, 20\}$ examples per class (T1, T2) or examples per training pool (T3). Reference models are standard sklearn pipelines: ridge regression, SVR-RBF, and random forest for T1; logistic regression, ridge classifier, linear SVC, and random forest for T2 and T3. The benchmark suite uses a 57-feature static descriptor (9 features per channel plus 3 cross-channel correlations), which reproduces a reference OSF feature implementation to within $1.87 \times 10^{-12}$ (machine precision), ensuring that any difference between bridge results and a reference experiment on real OSF data is attributable to the bridge itself, not feature drift.
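The T2 labelling rule and the balanced-accuracy metric used for the classification tasks are simple enough to write out. The sketch below follows the cutoff stated above and the usual definition of balanced accuracy as the mean of per-class recalls; the function names are ours, and a library implementation such as scikit-learn's `balanced_accuracy_score` would normally be used in practice.

```python
def second_life_class(soh, cutoff=0.85):
    """T2 label: 'reuse' strictly above the cutoff, 'recondition' otherwise."""
    return "reuse" if soh > cutoff else "recondition"


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Descriptor size quoted in the text: 9 features for each of the
# 6 channels, plus 3 cross-channel correlations.
N_FEATURES = 9 * 6 + 3
```

Because balanced accuracy averages recalls, a classifier that always predicts the majority class scores 0.5 on a binary task regardless of class imbalance, which is why 0.50 is the chance level for T2.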

## V Results

### V-A Sanity invariants

All five sanity invariants pass. Identity at OSF anchors: maximum absolute deviation $1.87 \times 10^{-12}$. SOH monotonicity: distance from SOH $= 1.0$ reference grows monotonically across $\text{SOH} \in \{1.00, 0.95, 0.90, 0.85, 0.80, 0.75\}$ at both 3.10 V and 2.81 V. Voltage smoothness: a 0.01 V step produces L2 change $\approx 261$ vs. $\approx 1881$ for a 0.10 V step. Regime classification and anomaly flag invariants pass on all boundary cases.

### V-B Distributional sanity at grounded anchors

Per-channel KS tests pass on 6 of 6 channels at both grounded anchors (3.10 V, 3.34 V) at $p > 0.05$. Per-channel standard-deviation ratios (synthetic / real) lie in $[0.98, 1.02]$, indicating essentially exact amplitude calibration. The mean absolute cross-channel correlation difference is 0.46 at 3.10 V and 0.35 at 3.34 V; the v1 bridge’s independent-channel-perturbation assumption is the source of this gap and is a known limitation (§[VII](https://arxiv.org/html/2605.20240#S7)). Qualitatively, bridge-generated samples at SOH $= 1.0$ overlay almost exactly onto real OSF samples at both grounded anchors across all six channels; the bridge captures the characteristic dip-spike feature near timestep $\approx 45$ in channels `B_s1C5` and `B_s2C6`, and per-anchor amplitude relationships across the six channels are preserved.

### V-C Ablation results

Headline results from the controlled ablations are summarised in Tables [I](https://arxiv.org/html/2605.20240#S5.T1) and [II](https://arxiv.org/html/2605.20240#S5.T2).

TABLE I: SOH regression $R^2$ (best model per shot, by-cell test split, 5-seed mean) on bridge v1.0 ablation pilots.

TABLE II: Second-life classification balanced accuracy (best model per shot, by-cell test split, 5-seed mean). Chance is 0.50.

Two patterns are visible.

A2 (shuffled SOH) collapses cleanly. SOH regression $R^2$ drops from $+0.77$ to $-0.04$ at $k = 20$, a swing of more than 0.8. Second-life accuracy falls to chance across all shot levels. Both tasks confirm that the bridge encodes input SOH non-trivially: when the SOH–cell binding is broken at generation time, downstream readouts cannot recover SOH from the synthetic signatures. This is the dataset’s primary validation: it is not producing label-aligned artefacts.
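For readers unfamiliar with negative $R^2$ values, the standard definition of the coefficient of determination is recalled below. The paper does not state its formula, so this is the usual form (as computed, for example, by scikit-learn's `r2_score`) and is assumed rather than confirmed to be the one used.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}
                   {\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2},
\qquad
\bar{y} \;=\; \frac{1}{n} \sum_{i=1}^{n} y_i .
```

Under this definition a model that always predicts the test-set mean scores $R^2 = 0$, so a value of $-0.04$ means the readout does slightly worse than that constant predictor, which is the expected outcome when the signatures carry no usable SOH information.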

A1 and A3 perform almost identically to A0. Across all four shot levels and both tasks, random and inverted perturbation directions match the principled-direction baseline within statistical noise (the largest cross-scenario gap is 0.05 $R^2$). We address what this means in §[VI-B](https://arxiv.org/html/2605.20240#S6.SS2).

### V-D Full-release benchmark on `by_cell_primary`

The ablation pilots of §[V-C](https://arxiv.org/html/2605.20240#S5.SS3) characterise the bridge’s response to controlled perturbations. We additionally report headline numbers on the full released v1.0 dataset under the primary leakage-safe split (Table [III](https://arxiv.org/html/2605.20240#S5.T3)), using standard sklearn pipelines and reporting the best model per task across five repeated cell-subsampling seeds (each seed sees a different 80% of the training cells). These are the numbers users should compare against when developing methods on MagBridge-Battery.
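Repeated cell-subsampling can be sketched as drawing, for each seed, a random 80% of the distinct training cells and keeping only rows from those cells. The helper below is an illustration under that reading; its name, the rounding rule for the number of cells kept, and sampling without replacement are assumptions rather than details confirmed by the paper.

```python
import random


def subsample_train_cells(train_cell_ids, seed, fraction=0.8):
    """Pick a reproducible random subset of the distinct training cells.

    train_cell_ids: iterable of cell IDs (duplicates allowed, one per row)
    Returns the set of cell IDs kept for this seed.
    """
    cells = sorted(set(train_cell_ids))  # sort for a seed-stable ordering
    rng = random.Random(seed)
    n_keep = max(1, round(fraction * len(cells)))
    return set(rng.sample(cells, n_keep))
```

Sampling whole cells rather than rows keeps every training subset cell-disjoint from the validation and test sets, so the variation across seeds reflects cell-level variability and not row-level leakage.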

TABLE III:Full MagBridge\-Battery v1\.0 benchmark on theby\_cell\_primarycell\-disjoint split\. Best model per task; 5 cell\-subsampling seeds; mean±\\pmstd\. Models considered: Ridge / SVR\-RBF / RF for T1; LogReg / RidgeCls / LinSVC / RF for T2–T4\.Note:BA denotes balanced accuracy\.

T4 (anomaly-subtype classification: distinguishing among the four subtypes sensor\_dropout, calibration\_drift, temporal\_warp, and periodic\_interference) is reported as an additional benchmark task. A random forest achieves $0.725 \pm 0.028$ on 4-way classification (chance $=0.25$), substantially above chance but with clear headroom for stronger readouts (sequence models, learned embeddings). T1 also leaves significant headroom ($R^{2}$ of $\approx 0.68$ with ridge regression on static features), suggesting that sequence-aware methods could meaningfully improve SOH regression. T2 is nearly saturated by classical baselines because second-life classification is essentially a thresholded view of SOH; once T1 is solved well, T2 follows. T3 sits in the middle, indicating that anomaly detection on this dataset is genuinely non-trivial under the cell-disjoint protocol but not intractable.

A note on the T3 anomaly-detection metric. T3 is a 3-way task under the released benchmark protocol: synthetic-anomaly samples are derived from clean parents and differ from them only by the perturbation type, while Regime-B samples differ from grounded-regime samples primarily by voltage. A simplified regime-separation sanity check using a random forest can reach $\text{AUROC}\approx 1.00$ when only the Regime-B-vs-grounded distinction is being decided, but we do not treat this as the primary benchmark, because it reflects strong voltage-anchor separation rather than discriminative learning of anomaly structure. The headline T3 balanced accuracy of $0.789$ in Table [III](https://arxiv.org/html/2605.20240#S5.T3) is computed on the full 3-way protocol that requires distinguishing perturbed-grounded, clean-grounded, and Regime-B samples on the cell-disjoint test split.
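To make the choice of balanced accuracy concrete, the sketch below computes it as the mean of per-class recalls and compares it with plain accuracy for a trivial majority-class predictor. The class sizes are the release-level counts (5,600 clean grounded, 600 anomaly, 560 Regime-B) used purely for illustration; the proportions in the actual test split are not assumed to match, and the integer class codes are hypothetical.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally regardless of size."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Hypothetical class codes: 0 = clean grounded, 1 = perturbed grounded, 2 = Regime-B.
y_true = np.array([0] * 5600 + [1] * 600 + [2] * 560)
y_majority = np.zeros_like(y_true)  # always predict the largest class

plain_accuracy = float(np.mean(y_true == y_majority))
ba_majority = balanced_accuracy(y_true, y_majority)
print(f"plain accuracy = {plain_accuracy:.3f}, balanced accuracy = {ba_majority:.3f}")
```

A constant predictor scores high on plain accuracy under this imbalance but only one third on 3-way balanced accuracy, which is the reference level the headline T3 number should be read against.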

### V-E Deep-learning baselines

We additionally evaluated three off-the-shelf neural baselines to quantify whether the raw length-100 signatures are immediately exploitable by standard sequence models: a 3-layer MLP on the flattened raw signal, a 3-block 1D CNN, and a single-layer LSTM. All three operate directly on the six-channel sequences after channel-wise standardization using training-set statistics. We trained with Adam (learning rate $10^{-3}$, weight decay $10^{-4}$, batch size 64) for up to 15 epochs with early stopping. We did not tune architecture depth, receptive field, or class weighting per individual task; these baselines are intended as reproducible off-the-shelf references rather than tuned ceilings. Results across three repeated cell-subsampling seeds are shown in Table [IV](https://arxiv.org/html/2605.20240#S5.T4).
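The channel-wise standardization step can be sketched as below. The array layout `(n_samples, 100, 6)` and the helper names are assumptions for illustration; the essential point is that the mean and standard deviation are estimated on training cells only and then reused unchanged on the test split.

```python
import numpy as np


def fit_channel_stats(X_train, eps=1e-8):
    """Per-channel mean/std pooled over samples and time steps of the training set."""
    mean = X_train.mean(axis=(0, 1))
    std = np.maximum(X_train.std(axis=(0, 1)), eps)  # guard against flat channels
    return mean, std


def standardize(X, mean, std):
    """Apply training-set statistics; broadcasting acts on the last (channel) axis."""
    return (X - mean) / std


# Toy signatures with very different channel scales (not the released data).
rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 0.1, 5.0, 2.0, 50.0])
X_train = rng.normal(size=(200, 100, 6)) * scales + scales
X_test = rng.normal(size=(50, 100, 6)) * scales + scales

mean, std = fit_channel_stats(X_train)
Z_train = standardize(X_train, mean, std)
Z_test = standardize(X_test, mean, std)  # no test-set statistics are used
```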

TABLE IV: Off-the-shelf deep-learning baselines on the by\_cell\_primary cell-disjoint split. Results are mean $\pm$ standard deviation over three repeated cell-subsampling seeds. The best classical baseline from Table [III](https://arxiv.org/html/2605.20240#S5.T3) is reproduced for comparison. Note: BA denotes balanced accuracy. Bold indicates the best deep-learning baseline for each classification task.

The neural baselines establish a useful lower bound for direct sequence-learning methods rather than a tuned deep-learning ceiling. The best neural models approach the static-feature classical baselines on T2 and T3: the 1D CNN reaches $0.858$ balanced accuracy on second-life classification compared with $0.907$ for RidgeCls, and the MLP reaches $0.755$ balanced accuracy on the 3-class anomaly task compared with $0.789$ for the random forest. In contrast, the gap is substantial on T1 and T4: the best neural model reaches only $R^{2}=0.17$ on SOH regression compared with $0.675$ for ridge regression, and $0.355$ balanced accuracy on anomaly-subtype classification compared with $0.725$ for the random forest. The MLP shows unstable behaviour on T1 (negative $R^{2}$ with high variance), consistent with overfitting on the flattened raw signal under limited training data.

We interpret this gap as a benchmark signal rather than a dataset deficiency\. The 57\-feature static descriptor used by the classical baselines encodes compact domain summaries — channel means, variation, energy, slopes, and cross\-channel correlations — that are effective at this release size and split\. Closing the gap likely requires representation\-learning methods with stronger inductive bias, pretraining, contrastive objectives, or sequence encoders tuned specifically for magnetic\-signature morphology\. MagBridge\-Battery is therefore not solved by off\-the\-shelf application of standard deep\-learning architectures, while leaving clear headroom for stronger learned representations\.
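As an illustration of what such static summaries look like, the sketch below computes the five feature families named above for a single `(time, channel)` signature. It is not the released 57-feature descriptor: the function name, the exact definitions, and the resulting length (39 for six channels) are assumptions for this example.

```python
import numpy as np


def static_features(x):
    """Summarise one signature of shape (T, C) as a flat feature vector."""
    T, C = x.shape
    t = np.linspace(0.0, 1.0, T)
    t_centered = t - t.mean()

    means = x.mean(axis=0)                       # channel means
    stds = x.std(axis=0)                         # channel variation
    energy = (x ** 2).mean(axis=0)               # mean signal energy
    slopes = (t_centered @ (x - means)) / (t_centered @ t_centered)  # least-squares slope
    corr = np.corrcoef(x, rowvar=False)          # (C, C) cross-channel correlations
    upper = corr[np.triu_indices(C, k=1)]        # keep each channel pair once

    return np.concatenate([means, stds, energy, slopes, upper])


# Toy signature: channel c ramps with slope c + 1, plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
true_slopes = np.arange(1, 7, dtype=float)
x = t[:, None] * true_slopes + rng.normal(scale=0.01, size=(100, 6))

features = static_features(x)
estimated_slopes = features[18:24]  # fourth block of six values
```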

## VI Discussion

### VI-A What the A2 collapse establishes

When PulseBat SOH labels are randomly permuted across cells before the bridge sees them, downstream SOH regression collapses from $R^{2}\approx 0.77$ to $\approx 0$. This makes the alternative explanations (bridge-generated samples being SOH-decodable through correlated covariates, dataset leakage, or label memorisation) unlikely under the tested protocol. In practical terms, a researcher training an SOH estimator on MagBridge-Battery is using bridge-consistent SOH-conditioned signal rather than a purely label-aligned artefact.
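A minimal sketch of the label-shuffle step is given below, assuming the conditioning labels are available as a mapping from cell identifier to SOH. The function name, the cell identifiers, and the SOH values are hypothetical; the only property taken from the text is that labels are permuted across cells before generation, so each cell keeps one label but loses its own.

```python
import numpy as np


def shuffle_soh_across_cells(soh_by_cell, seed=0):
    """Return a new cell -> SOH mapping with the SOH values permuted across cells."""
    rng = np.random.default_rng(seed)
    cells = sorted(soh_by_cell)
    values = np.array([soh_by_cell[c] for c in cells])
    permuted = rng.permutation(values)  # breaks the SOH-cell binding
    return dict(zip(cells, permuted.tolist()))


# Hypothetical cells and labels, for illustration only.
soh_by_cell = {"cell_01": 0.95, "cell_02": 0.88, "cell_03": 0.81,
               "cell_04": 0.74, "cell_05": 0.67}
shuffled = shuffle_soh_across_cells(soh_by_cell, seed=0)
```

The marginal SOH distribution is unchanged by construction, so any remaining decodability after the shuffle would have to come from something other than the conditioning label.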

### VI-B What A1 $\approx$ A0 means

The data show that random and inverted perturbation directions in the 4-D LDA subspace produce SOH-decoding accuracy within $\sim 0.05$ $R^{2}$ of the principled fitted direction. The architectural reason is geometric: the OSF dataset contains exactly five anchor clusters in the 4-D subspace, separated by $\sim 280$ units with within-cluster scatter $\sim 1.8$ (a separation ratio of 157). Any perturbation of meaningful magnitude moves the embedding into a different anchor’s neighbourhood, regardless of direction. The downstream SOH-decoding signal is generated by the resulting blend, with amplitude scaling proportional to $(1-\text{SOH})$.

In other words, the bridge produces decodable SOH content via *anchor blending under SOH-driven blend amplitude*, a property inherited from the bank’s small, sharply-clustered geometry, not from any specific embedding direction. We treat this as a property of retrieve-and-blend bridge architectures over small morphology banks, of which MagBridge-Battery is one. A more flexible decoder (e.g. a conditional generative model trained on OSF data) would in principle let the conditioning direction shape output distributions non-trivially, and would change this finding; we treat this as the most promising single-axis future-work direction (§[VII](https://arxiv.org/html/2605.20240#S7)).

### VI-C Implications for synthetic-data work in this field

Two implications follow\. First, bridge architectures that retrieve from a finite morphology bank inherit the bank’s geometry as a structural prior\. Richer banks — expanded by collaborator\-collected real measurements, augmented by physics\-simulation, or replaced by a learned generative model — would directly test whether more anchor diversity creates room for direction\-of\-perturbation to matter\.

Second, the controlled\-ablation methodology developed here is portable\. The four\-scenario design \(baseline, random direction, label shuffle, inverted direction\) requires no domain\-specific knowledge once the bridge has explicit conditioning inputs and a perturbation mechanism\. We recommend it as a default validation protocol for synthetic\-data work that claims to encode any kind of label structure into outputs\.

### VI-D Comparison to the QuaLiProM trajectory

The QuaLiProM consortium \[[2](https://arxiv.org/html/2605.20240#bib.bib17)\] is building, in private, the kind of dataset MagBridge-Battery attempts to substitute for: paired magnetic measurements with electrochemical degradation labels at industrial scale. Our work and theirs occupy different positions on a spectrum: they pay the cost of measurement and gain ground-truth signal; we pay the cost of architectural assumptions and gain public availability. When QuaLiProM-style data eventually becomes public, MagBridge-Battery becomes immediately falsifiable in a constructive sense: one can compare bridge outputs to real measurements at matched (SOH, SOC, voltage) tuples and quantify the mismatch.

## VII Limitations and intended use

### VII-A Intended uses

MagBridge-Battery v1.0 is intended for:

- Benchmark SOH regression on grounded-regime clean samples (T1).
- Benchmark second-life classification on grounded-regime clean samples (T2).
- Benchmark anomaly detection using the 600 synthetic anomalies and the 560 Regime-B extrapolation samples (T3).
- Studying OOD extrapolation via the three Regime-B voltage anchors.
- Quantifying split-leakage effects by training on both shipped splits and reporting the gap.
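For users who build their own splits when studying leakage, the three properties verified for the primary split (no shared cells, no parent and child on opposite sides, no shared sample IDs) can be checked with a few set operations. The sketch below is an assumption-laden illustration: the record fields `sample_id`, `cell_id`, and `parent_id` are hypothetical names, not the released schema.

```python
def split_leakage_report(train, test):
    """Count the three leakage types between two lists of sample records (dicts)."""
    train_ids = {r["sample_id"] for r in train}
    test_ids = {r["sample_id"] for r in test}
    train_cells = {r["cell_id"] for r in train}
    test_cells = {r["cell_id"] for r in test}

    # A child (e.g. a synthetic anomaly) whose clean parent sits in the other split.
    cross_pairs = sum(1 for r in train if r.get("parent_id") in test_ids)
    cross_pairs += sum(1 for r in test if r.get("parent_id") in train_ids)

    return {
        "overlapping_cells": len(train_cells & test_cells),
        "cross_split_parent_child_pairs": cross_pairs,
        "sample_id_overlap": len(train_ids & test_ids),
    }


# Hypothetical records, for illustration only.
train = [
    {"sample_id": "s1", "cell_id": "c1", "parent_id": None},
    {"sample_id": "s2", "cell_id": "c1", "parent_id": "s1"},
]
clean_test = [{"sample_id": "s3", "cell_id": "c2", "parent_id": None}]
leaky_test = [{"sample_id": "s4", "cell_id": "c1", "parent_id": "s1"}]

clean_report = split_leakage_report(train, clean_test)
leaky_report = split_leakage_report(train, leaky_test)
```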

### VII-B Out-of-scope uses

- SOH regression on Regime-B samples (soh is NaN by design).
- Treating B\_s1C5 or B\_s2C6 as physical magnitudes.
- Treating time\_norm as per-sample timing information.
- Cross-chemistry transfer: only LFP is represented in v1.0.
- Substituting MagBridge-Battery for real magnetic-sensing measurements in safety-critical decisions.

### VII-C Known limitations

Single-chemistry, single-cell coverage. The v1.0 release uses LFP cells exclusively, and the OSF morphology bank is from one physical cell. The bridge therefore captures one cell’s morphology, with no inter-cell variability in the magnetic signature.

No ground-truth external validation. The validation we report is structural (sanity invariants), distributional (KS tests at anchors), and ablation-based (A2). What we cannot do, by construction, is compare bridge outputs to real magnetic measurements from cells with known SOH labels, because no such public data exists. The benchmark numbers reported here have meaning as internal metrics under documented bridge assumptions; their generalisation to real magnetic data will only be testable when paired data becomes available. Every generated sample carries its full provenance so that a future researcher with paired data can identify matching real samples and compute residuals directly.

Regime-B carries no SOH label. By design, the 560 Regime-B extrapolation samples have soh, u\_features, and second\_life\_class set to NaN. Users should filter to regime == "grounded" before fitting T1 / T2 models.
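The filtering step might look like the following in pandas. The column names `regime`, `soh`, and `second_life_class` follow the text above; the toy rows, the `sample_id` column, and the non-grounded regime value `"regime_b"` are assumptions, so the released metadata should be checked for the actual values.

```python
import numpy as np
import pandas as pd

# Toy metadata table standing in for the released Parquet shards.
df = pd.DataFrame({
    "sample_id": ["g0", "g1", "b0"],
    "regime": ["grounded", "grounded", "regime_b"],  # "regime_b" is a placeholder value
    "soh": [0.91, 0.78, np.nan],
    "second_life_class": [1.0, 0.0, np.nan],
})

# Keep only grounded-regime rows before fitting T1 / T2 models.
grounded = df[df["regime"] == "grounded"].copy()

# Defensive check: no NaN targets should survive the filter.
assert grounded["soh"].notna().all()
assert grounded["second_life_class"].notna().all()
```

Fitting a regressor on unfiltered rows would either fail on the NaN targets or, if they were silently dropped, hide the fact that Regime-B is out of scope for T1 and T2.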

Architectural ceiling from retrieve-and-blend decoder. As discussed in §[VI-B](https://arxiv.org/html/2605.20240#S6.SS2), the A1 $\approx$ A0 finding partly reflects a property of the retrieve-and-blend decoder. A conditional generative-model decoder would in principle let SOH conditioning shape outputs non-trivially even with limited training data; this is the most promising single-axis future direction.

Channel correlation structure. The v1 bridge family treats per-channel perturbations as independent in time-domain noise application. At grounded anchors, the mean absolute difference between bridge and real channel-correlation matrices is approximately 0.4. This does not affect headline downstream-task results but is a known imperfection.
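One way to compute such a discrepancy is sketched below. The exact recipe behind the reported 0.4 is not specified here, so the pooling over samples and time steps and the restriction to off-diagonal entries are assumptions, as are the function name and the toy data.

```python
import numpy as np


def mean_abs_corr_diff(real, synth):
    """Mean absolute off-diagonal difference between two channel-correlation matrices.

    Both inputs have shape (n_samples, T, C); correlations are pooled over
    samples and time steps.
    """
    C = real.shape[-1]
    corr_real = np.corrcoef(real.reshape(-1, C), rowvar=False)
    corr_synth = np.corrcoef(synth.reshape(-1, C), rowvar=False)
    off_diagonal = ~np.eye(C, dtype=bool)  # the diagonal is 1 in both matrices
    return float(np.abs(corr_real - corr_synth)[off_diagonal].mean())


# Toy data: "real" channels share a common component, "synthetic" channels do not.
rng = np.random.default_rng(0)
shared = rng.normal(size=(50, 100, 1))
real = shared + 0.3 * rng.normal(size=(50, 100, 6))
synth = rng.normal(size=(50, 100, 6))

gap = mean_abs_corr_diff(real, synth)
self_gap = mean_abs_corr_diff(real, real)
```

Independent per-channel noise drives synthetic cross-channel correlations toward zero, which is the mechanism the limitation above describes.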

Constant time axis. time\_norm is the same vector for every sample (a fixed reference grid). It is included for loader convenience and carries no per-sample information.

SOC coverage. PulseBat samples cluster at SOC $\in \{5, 10, \ldots, 50\}\%$. Conditioning fidelity outside this range is untested.

### VII-D Future-work directions

In rough order of cost-to-benefit:

1. Conditional generative-model decoder (replace retrieve-and-blend with a small VAE or diffusion model on OSF samples).
2. Lab collaboration for paired data: even 50–100 cells with paired SOH labels and OPM measurements would enable ground-truth validation.
3. Synthetic magnetic anchors via FEM simulation (expand the morphology bank).
4. Chemistry-transfer extension once paired data exists at any chemistry beyond LFP.
5. Integration with QuaLiProM-style data when it becomes public.

## VIII Conclusion

We released MagBridge-Battery v1.0, a 6,760-sample synthetic dataset bridging the OSF magnetic-morphology archive with PulseBat electrochemical degradation labels, motivated by the absence of any public dataset pairing magnetic measurements with state-of-health information. The release comprises 5,600 PulseBat-conditioned grounded samples, 600 synthetic sensor-anomaly samples derived from clean parents, and 560 low-voltage Regime-B extrapolation samples, together with a cell-disjoint, parent-child-leakage-free primary benchmark split. A label-shuffle controlled ablation confirms that the bridge encodes input SOH into output morphology non-trivially: SOH regression collapses from $R^{2}\approx 0.77$ to $\approx 0$ when the SOH–cell binding is broken at generation time. A direction-perturbation ablation establishes a complementary architectural finding: in a retrieve-and-blend bridge over a small, sharply-clustered morphology bank, the principled latent direction is not privileged for downstream decoding, because the bank’s geometry dominates the signal. The dataset, bridge implementation, and a benchmark suite covering SOH regression, second-life classification, and anomaly detection are released to support method development while public paired magnetic–electrochemical data remains unavailable.

## Code and data availability

MagBridge-Battery v1.0 is released on Zenodo under CC-BY-4.0 (DOI: 10.5281/zenodo.20260147). The bridge implementation, ablation scripts, and benchmark suite are released on GitHub under Apache-2.0 at [https://github.com/SakthiGs/MagBridge-Battery](https://github.com/SakthiGs/MagBridge-Battery). The release bundle contains data shards (Parquet), metadata, both benchmark splits, the generation manifest pinning bridge version and configuration hash, a minimal Python loader, SHA-256 checksums, a citation file, dataset documentation, and a LICENSE file documenting the upstream sources (OSF magnetometry archive, PulseBat dataset) and their respective license declarations.

## How to cite

Users of MagBridge-Battery are kindly requested to cite both the dataset DOI and this paper. The dataset DOI uniquely identifies the v1.0 data artifact; the paper documents the bridge construction, validation, and benchmark protocol. See CITING.md in the release bundle for copy-paste BibTeX.

## Acknowledgements

The authors thank the Mohammadi–Jerschow group for openly releasing the OSF battery magnetometry archive, which made this work possible\. We also thank Prof\. Alexej Jerschow for helpful clarification regarding the public OSF dataset\. The authors further thank the PulseBat team \(Tao et al\.\) for releasing pulse\-voltage response data on retired Li\-ion cells\. Any errors or interpretations in this work remain the responsibility of the authors\.

## Appendix A Bridge degradation modulator: full equations

For completeness we list the equations referenced in §[III-C](https://arxiv.org/html/2605.20240#S3.SS3). The morphology bank produces a base sample at voltage $v$ by linear interpolation between bracketing anchors $v_{l}\leq v\leq v_{h}$:

$$\boldsymbol{\mu}(v)=(1-\alpha)\,\boldsymbol{\mu}_{v_{l}}+\alpha\,\boldsymbol{\mu}_{v_{h}},\quad\alpha=\frac{v-v_{l}}{v_{h}-v_{l}} \tag{1}$$

$$\boldsymbol{\sigma}(v)=\sqrt{(1-\alpha)\,\boldsymbol{\sigma}_{v_{l}}^{2}+\alpha\,\boldsymbol{\sigma}_{v_{h}}^{2}} \tag{2}$$

$$\mathbf{X}_{\text{base}}=\boldsymbol{\mu}(v)+\boldsymbol{\sigma}(v)\odot\boldsymbol{\epsilon},\quad\boldsymbol{\epsilon}\sim\mathcal{N}(0,I) \tag{3}$$

The degradation modulator computes the MagBridge-Embed of the base sample, projects to LDA space:

$$\mathbf{z}_{\text{base}}=\left(\boldsymbol{\Sigma}^{-1/2}(\mathbf{e}_{\text{base}}-\boldsymbol{\mu}_{e})-\bar{\mathbf{x}}\right)\mathbf{W}_{\text{LDA}} \tag{4}$$

perturbs along the fitted state direction $\hat{\mathbf{d}}$:

$$\mathbf{z}_{\text{pert}}=\mathbf{z}_{\text{base}}+m\cdot\hat{\mathbf{d}},\quad m=-\gamma\cdot\delta,\quad\delta=\max(0,1-\text{SOH}) \tag{5}$$

decodes via cone-restricted $k$-NN softmin:

$$\mathbf{X}_{\text{decode}}=\sum_{i\in\mathcal{N}_{k}}w_{i}\cdot\mathbf{X}_{i}^{\text{OSF}},\quad w_{i}=\frac{\exp(-d_{i}/\tau)}{\sum_{j}\exp(-d_{j}/\tau)} \tag{6}$$

and blends with the base proportionally to $\delta$:

$$\mathbf{X}_{\text{mod}}=(1-\beta)\,\mathbf{X}_{\text{base}}+\beta\,\mathbf{X}_{\text{decode}},\quad\beta=\min(1,1.5\delta) \tag{7}$$

with v1.0 release configuration $\gamma=800$, $k=8$, $\tau=50$, cone half-angle $75^{\circ}$. The bridge configuration is pinned by hash in the released manifest.
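The flow of Eqs. (1)–(3) and (5)–(7) can be traced in a few lines of NumPy. This is a simplified sketch under stated assumptions, not the released bridge: the embedding and LDA projection of Eq. (4) are skipped (a toy latent vector is supplied directly), the cone restriction on the neighbour set is omitted, $d_{i}$ is taken to be Euclidean distance in the latent space, and all function names, anchor voltages, and bank contents are invented for illustration. Only $\gamma=800$, $k=8$, and $\tau=50$ come from the text.

```python
import numpy as np


def interpolate_base(v, v_l, v_h, mu_l, mu_h, sig_l, sig_h, rng):
    alpha = (v - v_l) / (v_h - v_l)
    mu = (1 - alpha) * mu_l + alpha * mu_h                        # Eq. (1)
    sigma = np.sqrt((1 - alpha) * sig_l**2 + alpha * sig_h**2)    # Eq. (2)
    return mu + sigma * rng.standard_normal(mu.shape)             # Eq. (3)


def perturb(z_base, d_hat, soh, gamma=800.0):
    delta = max(0.0, 1.0 - soh)
    return z_base - gamma * delta * d_hat, delta                  # Eq. (5), m = -gamma * delta


def softmin_decode(z_pert, bank_z, bank_X, k=8, tau=50.0):
    d = np.linalg.norm(bank_z - z_pert, axis=1)   # assumed Euclidean; no cone restriction
    idx = np.argsort(d)[:k]                       # k nearest bank entries
    logits = -d[idx] / tau
    w = np.exp(logits - logits.max())             # numerically stable softmin weights
    w /= w.sum()
    return np.tensordot(w, bank_X[idx], axes=1)   # Eq. (6)


def blend(X_base, X_decode, delta):
    beta = min(1.0, 1.5 * delta)
    return (1 - beta) * X_base + beta * X_decode                  # Eq. (7)


def modulate(X_base, z_base, d_hat, soh, bank_z, bank_X):
    z_pert, delta = perturb(z_base, d_hat, soh)
    return blend(X_base, softmin_decode(z_pert, bank_z, bank_X), delta)


# Toy morphology bank: 20 signatures of shape (100, 6) with 4-D latent codes.
rng = np.random.default_rng(0)
T, C = 100, 6
mu_l, mu_h = rng.normal(size=(T, C)), rng.normal(size=(T, C))
sig_l, sig_h = np.full((T, C), 0.1), np.full((T, C), 0.2)
bank_z = rng.normal(scale=100.0, size=(20, 4))
bank_X = rng.normal(size=(20, T, C))
d_hat = np.array([1.0, 0.0, 0.0, 0.0])

X_base = interpolate_base(3.3, 3.2, 3.4, mu_l, mu_h, sig_l, sig_h, rng)
X_fresh = modulate(X_base, bank_z[0], d_hat, soh=1.0, bank_z=bank_z, bank_X=bank_X)
X_aged = modulate(X_base, bank_z[0], d_hat, soh=0.8, bank_z=bank_z, bank_X=bank_X)
```

With the release value $\gamma=800$, a cell at SOH $=0.8$ is shifted by $800\times 0.2=160$ latent units and blended with $\beta=0.3$, while SOH $=1$ gives $\delta=0$ and returns the base sample unchanged. A shift of that size is of the same order as the $\sim 280$-unit anchor separation quoted in §VI-B.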

## References

- \[1\] W. Evans, T. Coussens, M. T. M. Woodley, A. M. Fabricant, G. D. Kendall, M. Sonnet, D. Wasylowski, D. U. Sauer, F. Oručević, and P. Krüger (2025). Quantum magnetic imaging of current density in lithium-ion batteries. arXiv preprint 2512.01125. [Document](https://dx.doi.org/10.48550/arXiv.2512.01125), [Link](https://arxiv.org/abs/2512.01125).
- \[2\] Fraunhofer IFAM, FAU Erlangen-Nürnberg, Industrial Dynamics GmbH, and project partners (2025). QuaLiProM: quality control and second-life applications of lithium-ion batteries using atomic magnetometry and AI. BMBF-funded research consortium. Project running through November 2026; project data not public at time of writing.
- \[3\] K. Fujii and K. Nakajima (2017). Harnessing disordered-ensemble quantum dynamics for machine learning. Physical Review Applied 8(2), pp. 024030. [Document](https://dx.doi.org/10.1103/PhysRevApplied.8.024030).
- \[4\] Y. Hu, G. Z. Iwata, M. Mohammadi, E. V. Silletta, A. Wickenbrock, J. W. Blanchard, D. Budker, and A. Jerschow (2020). Sensitive magnetometry reveals inhomogeneities in charge storage and weak transient internal currents in Li-ion cells. Proceedings of the National Academy of Sciences 117(20), pp. 10667–10672. [Document](https://dx.doi.org/10.1073/pnas.1917172117).
- \[5\] A. J. Ilott and A. Jerschow (2017). Super-resolution surface microscopy of conductors using magnetic resonance. Scientific Reports 7, pp. 5425. [Document](https://dx.doi.org/10.1038/s41598-017-05878-w).
- \[6\] A. J. Ilott, M. Mohammadi, C. M. Schauerman, M. J. Ganter, and A. Jerschow (2018). Rechargeable lithium-ion cell state of charge and defect detection by in-situ inside-out magnetic resonance imaging. Nature Communications 9(1), pp. 1776. [Document](https://dx.doi.org/10.1038/s41467-018-04192-x).
- \[7\] M. Mohammadi and A. Jerschow (2020). Battery magnetometry data 2019–2020 (OSF). Open Science Framework, battery magnetometry data archive. [Document](https://dx.doi.org/10.17605/OSF.IO/CW8ZV), [Link](https://osf.io/cw8zv/).
- \[8\] S. Pollok, M. Khoshkalam, F. Ghaffari-Tabrizi, F. Kurnia, D. Wang, S. Li, D. B. Bucher, J. L. M. Rupp, and D. V. Christensen (2025). Magnetic microscopy for operando imaging of battery dynamics. Nature Communications 16(1), pp. 8303. [Document](https://dx.doi.org/10.1038/s41467-025-63409-y).
- \[9\] K. Romanenko, P. W. Kuchel, and A. Jerschow (2020). Accurate visualization of operating commercial batteries using specialized magnetic resonance imaging with magnetic field sensing. Chemistry of Materials 32(5), pp. 2107–2113. [Document](https://dx.doi.org/10.1021/acs.chemmater.9b05246).
- \[10\] K. A. Severson, P. M. Attia, N. Jin, N. Perkins, B. Jiang, Z. Yang, M. H. Chen, M. Aykol, P. K. Herring, D. Fraggedakis, M. Z. Bazant, S. J. Harris, W. C. Chueh, and R. D. Braatz (2019). Data-driven prediction of battery cycle life before capacity degradation. Nature Energy 4(5), pp. 383–391. [Document](https://dx.doi.org/10.1038/s41560-019-0356-8).
- \[11\] S. Tao, R. Ma, Z. Zhao, G. Ma, L. Su, H. Chang, Y. Chen, H. Liu, Z. Liang, T. Cao, H. Ji, Z. Han, M. Lu, H. Yang, Z. Wen, J. Yao, R. Yu, G. Wei, Y. Li, X. Zhang, T. Xu, and G. Zhou (2024). Generative learning assisted state-of-health estimation for sustainable battery recycling with random retirement conditions. Nature Communications 15, pp. 10154. Includes the PulseBat dataset. [Document](https://dx.doi.org/10.1038/s41467-024-54454-0).
