A Simulated Federated Analysis of MS-Induced Brain Lesions
Summary
This paper introduces a simulation framework for federated analysis of Multiple Sclerosis brain lesions, combining image segmentation with clinical data analysis to test federated learning methods while preserving patient privacy.
# A Simulated Federated Analysis of MS-Induced Brain Lesions
Source: [https://arxiv.org/html/2605.08223](https://arxiv.org/html/2605.08223)
###### Abstract
Federated techniques such as federated learning and federated analysis have emerged as a powerful paradigm for enabling multi\-center research on sensitive clinical data while preserving patient privacy\. In this study, we introduce a simulation framework that emulates a real\-world federated research project focused on the analysis of multiple sclerosis \(MS\) patient data\. The project comprises two components: an image segmentation task and a clinical data analysis task, where federated variants of survival analysis and Principal Component Analysis \(PCA\) are employed\. To capture the complexity and heterogeneity of real clinical datasets, we construct a federation of high\-fidelity synthetic cohorts designed to mirror MS\-related clinical and demographic characteristics, while the imaging component leverages publicly available real\-world datasets\.
Our simulation replicates key elements of authentic federated workflows, including distributed data governance, site\-specific preprocessing, model training across isolated nodes, and the secure aggregation of analytical outputs\. This framework provides a realistic testbed for developing, evaluating, and benchmarking federated learning methods in the context of MS research\.
## I Introduction
This manuscript presents a simulated analysis of Multiple Sclerosis disease progression using both image and clinical data\. The analysis presented here is inspired by outcomes from the INTONATE network \[[17](https://arxiv.org/html/2605.08223#bib.bib4),[19](https://arxiv.org/html/2605.08223#bib.bib22)\]\. The INTONATE\-MS consortium is a public–private research consortium between Universitätsklinikum Münster, Penn Medicine, Unity Health Toronto, Erasmus MC and Roche\. It constitutes a collaborative federated research framework that integrates large\-scale, multicenter clinical trial data with real\-world evidence \(RWE\) to enhance the understanding and management of multiple sclerosis \(MS\)\. In particular, the image analysis follows the federated image segmentation study \[[10](https://arxiv.org/html/2605.08223#bib.bib2)\] and the statistical part is based on the multi\-center integration study \[[18](https://arxiv.org/html/2605.08223#bib.bib3)\], both part of INTONATE\.
In this paper, we demonstrate the interplay of image analysis, statistical inference, and survival analysis in a federated setting\. We present an end\-to\-end workflow for multimodal, multi\-center analysis that can provide a valuable contribution to drug development\.
The image data used in this study comes from public datasets\[[7](https://arxiv.org/html/2605.08223#bib.bib1)\], while all clinical datasets are fully synthetic\. Although the public image dataset includes clinical tables as well, we chose to generate artificial data to allow a wider variety of statistical methods and effects to be demonstrated\.
The clinical data can be mapped to an OMOP CDM\. Clinical measures, relapses and symptoms can be aligned across observation and measurement tables, while demographics and disease attributes can be mapped from person, condition, and drug exposure tables\. A detailed mapping of source variables to their corresponding OMOP CDM tables is provided in Table[I](https://arxiv.org/html/2605.08223#S1.T1)\.
TABLE I: Mapping of source variables to OMOP CDM tables\.
Some statistical patterns were intentionally designed for demonstration and do not reflect real clinical scenarios\. This paper demonstrates how federated statistics and federated machine learning can extract insights from data that are not directly accessible and distributed across multiple sites\. In particular, we demonstrate that a federated analysis on an entire statistical ensemble can reveal patterns that are not visible when considering only isolated subsets\.
## II Federated Analytics
Federated analytics \[[6](https://arxiv.org/html/2605.08223#bib.bib7)\] is a system paradigm for joint analysis across distributed, sensitive medical datasets without centralizing the data\. In clinical and biomedical research settings, datasets are often siloed across institutions due to privacy, regulatory, and governance constraints, limiting the applicability of traditional centralized analytics pipelines\. Federated analytics addresses this challenge by executing analytical computations locally at each data\-holding site and sharing only intermediate results or aggregates for global analysis\. The federated architecture adds communication overhead and therefore increases compute time, but it unlocks data sources that would otherwise not be eligible for analysis\. From a computer\-based medical systems perspective, the primary contribution lies in system architecture, orchestration, and secure computation rather than in automated clinical decision\-making\. This makes federated analytics particularly suitable for multi\-center studies, drug development, and medical research infrastructures\.
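The pattern of local computation followed by central aggregation can be illustrated with a minimal sketch\. It assumes each site shares only a count, a sum and a sum of squares for one variable; the names `LocalStats`, `local_stats` and `aggregate` are hypothetical and do not refer to the Apheris API\.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class LocalStats:
    """Aggregates a site shares; no row-level data is included."""
    n: int
    total: float
    total_sq: float


def local_stats(values):
    """Runs at a site: reduce raw values to count, sum and sum of squares."""
    return LocalStats(len(values), sum(values), sum(v * v for v in values))


def aggregate(stats):
    """Runs centrally: combine site aggregates into a global mean and SD."""
    n = sum(s.n for s in stats)
    total = sum(s.total for s in stats)
    total_sq = sum(s.total_sq for s in stats)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # sample variance
    return mean, sqrt(max(variance, 0.0))


# Hypothetical toy values standing in for one clinical variable at two sites.
site_1 = [2.0, 3.5, 4.0]
site_2 = [1.0, 6.0, 2.5, 5.0]
mean, sd = aggregate([local_stats(site_1), local_stats(site_2)])
print(round(mean, 3), round(sd, 3))
```

The global mean and standard deviation equal those of the pooled data, although the central step never sees an individual value\.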
### II\-A Architecture
Figure [1](https://arxiv.org/html/2605.08223#S2.F1) illustrates the general architecture of a federated learning network as implemented by Apheris\.
Figure 1: Apheris federated architecture \[[4](https://arxiv.org/html/2605.08223#bib.bib20)\]
The Gateway is an agent that network participants can deploy into a Kubernetes cluster, where it launches computations as pods\. Each Gateway, hosting its local data, is deployed within its own isolated Virtual Private Cloud \(VPC\)\. The central Apheris Orchestrator, responsible for model parameter collection and aggregation, is also deployed in its own VPC\.
Asset policies govern both access to datasets registered to an Apheris Gateway and the associated privacy controls, ensuring that sensitive patient data can remain local while still contributing to global model development \[[9](https://arxiv.org/html/2605.08223#bib.bib19),[8](https://arxiv.org/html/2605.08223#bib.bib23)\]\. An overview of the federated computation and training workflow is shown in Figure [2](https://arxiv.org/html/2605.08223#S2.F2)\.
Figure 2:Federated computation workflow with central aggregation on the orchestrator\.
### II\-B Gateway\-side Dataset Setup
The analysis was designed around two synthetic sites, each containing both image and tabular data\. At each site, the datasets were registered to an Apheris Gateway; the procedure is detailed in the Apheris documentation \[[2](https://arxiv.org/html/2605.08223#bib.bib5)\]\. Registering a dataset to a compute gateway connects the data with the Apheris product, and each compute gateway is tied to a single organization\. We use a setup with two gateways, each with one tabular clinical dataset and one image dataset registered\. The data remains at its original location and is never downloaded or transmitted; asset policies grant access only to authorized compute jobs\.
### II\-C Open Source Federated Frameworks
Apheris’ federated engine is based on NVFlare and can also integrate with Flower\. Other frameworks such as OpenFL, FATE, or PySyft have not yet been tested with Apheris, but conceptually any server\-based federated engine could be integrated\. Open\-source federated learning frameworks like Kaapana or NVFlare have proven effective for multi\-site hospital collaborations, but they primarily address where computation happens \(keeping data on\-site\) without fine\-grained control over what gets computed\. Apheris adds a computational access governance layer: data custodians define per\-asset policies that restrict which computations may run on their data, ensuring only approved, privacy\-preserving workloads execute\. The Gateway is designed to integrate into existing infrastructure with minimal deployment overhead, making it practical for cross\-organizational collaborations where multiple independent parties need auditable control over how their data is used\.
## III Related Work
Recent advances in multiple sclerosis \(MS\) research have increasingly leveraged federated learning\[[15](https://arxiv.org/html/2605.08223#bib.bib8)\]to enable privacy\-preserving analysis of multi\-center clinical data\. Personalized federated learning approaches have been proposed to improve predictive performance by adapting shared models to local distributions, using techniques such as selective parameter sharing and personalized fine\-tuning\[[20](https://arxiv.org/html/2605.08223#bib.bib15)\]\. Federated learning has also been integrated into broader multi\-layer data pipelines to facilitate large\-scale collaboration and systematic data processing across institutions\[[21](https://arxiv.org/html/2605.08223#bib.bib16)\]\. Complementary approaches in MS data modeling employ Bayesian methods, machine learning techniques, and Common Data Model \(CDM\) based federated learning to harmonize heterogeneous real\-world datasets and enhance predictive modeling\[[23](https://arxiv.org/html/2605.08223#bib.bib17)\]\. In the context of imaging, federated learning has been applied to improve MS lesion segmentation across clinical sites, incorporating noise\-resilient training and label correction to enhance segmentation performance\[[3](https://arxiv.org/html/2605.08223#bib.bib18)\]\. Moreover, explainable federated learning methods have been explored for MS detection and lesion localization, enabling interpretable models that provide insight into both prediction and spatial localization of disease features\[[16](https://arxiv.org/html/2605.08223#bib.bib14)\]\.
In this work, we use artificially simulated data to illustrate how federated image segmentation and federated analysis are already being applied in practice\. In collaboration between Roche and Apheris, important contributions to MS\-induced lesion segmentation\[[10](https://arxiv.org/html/2605.08223#bib.bib2)\]and MS disease progression\[[17](https://arxiv.org/html/2605.08223#bib.bib4)\]have already been realized within the INTONATE\-MS consortium\.
## IV Image analysis
We now present a concrete showcase that demonstrates how these concepts can be applied in practice and begin with the image analysis\. To this end, we fine\-tuned an nnU\-Net\[[12](https://arxiv.org/html/2605.08223#bib.bib6),[11](https://arxiv.org/html/2605.08223#bib.bib12)\]in a federated learning setting using the two imaging datasets described below\.
### IV\-A Dataset Description
The MSLesSeg \[[7](https://arxiv.org/html/2605.08223#bib.bib1)\] dataset contains 115 NIfTI brain MRI scans from 75 patients, with three channels: T1, T2, and FLAIR\. Each scan has 182 slices and an associated segmentation mask marking expert\-annotated brain lesions for every slice\. The 115 scans are split by patient ID into two subsets of 50 and 65 images, one per site\. Each subset was further split into train and test sets, giving 41 train and 9 test images on site\-1 and 52 train and 13 test images on site\-2\.
### IV\-B Federated Training
To run the federated fine\-tuning on both image datasets, we use the Apheris Gateway and first create a compute spec with the following dataset, model, and resource configuration:

    # `compute` is assumed to be the compute module of the Apheris CLI Python API;
    # `dataset_ids` is assumed to hold the IDs of the registered image datasets.
    compute_spec_id = compute.create_from_args(
        dataset_ids=dataset_ids,
        model_id="apheris-nnunet",
        model_version="0.28.0",
        client_memory=32000,
        client_n_cpu=14,
        client_n_gpu=1,
        server_memory=16000,
        server_n_cpu=7,
    )

Once the compute spec is activated and running via the Apheris CLI, jobs that trigger the federated training can be submitted to it\. A typical training payload is shown below:

    payload = {
        "mode": "training",
        "device": "cuda",
        "num_rounds": 30,
        "model_configuration": "2d",
        "dataset_id": 123,
    }
    # `job` is assumed to be the job module of the Apheris CLI Python API.
    job.submit(payload, compute_spec_id=compute_spec_id, verbose=True)

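The source does not state how the orchestrator combines the client models after each round\. A common choice is the weighted parameter averaging of FedAvg \[[15](https://arxiv.org/html/2605.08223#bib.bib8)\], sketched below under the assumption that each client returns a flattened parameter vector; the function `fedavg` and the parameter values are hypothetical, and only the training\-set sizes of 41 and 52 images come from the dataset description\.

```python
def fedavg(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors (FedAvg-style)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical flattened parameters after one local training round.
site_1_params = [0.10, -0.40, 0.25]
site_2_params = [0.30, -0.20, 0.05]

# Weights are the local training-set sizes (41 and 52 images).
global_params = fedavg([site_1_params, site_2_params], [41, 52])
print(global_params)
```

Weighting by sample count lets the larger site contribute proportionally more to the shared model; with `num_rounds` set to 30, such an aggregation step would run 30 times\.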
Once the training job has finished, a model checkpoint can be downloaded via the Apheris CLI and used for inference\. An inference job produces, for each image in the inference set, an inferred segmentation mask for every slice of the MRI scan\. One slice of the raw data together with the ground truth and inferred segmentation masks is shown in Figure [3](https://arxiv.org/html/2605.08223#S4.F3)\.
Figure 3: Single slice of an image data point \(bottom row\) together with the true and predicted segmentation masks \(top row\)\. The Dice score is 0\.85 on the selected slice and 0\.67 for the overall image\. The bottom row shows the MRI images in the FLAIR, T1 and T2 channels\.
The federated model training presented here is intended as a proof\-of\-concept to demonstrate the end\-to\-end workflow rather than to optimize model performance\. Quantitative improvements achieved through federated fine\-tuning within the INTONATE\-MS consortium have been reported in \[[10](https://arxiv.org/html/2605.08223#bib.bib2)\], where the federated nnU\-Net model achieved Dice scores ranging from 0\.66 to 0\.80 in the evaluations\. While the image data and ground truth are sensitive and not directly visible to the user, the inferred segmentation mask can either be returned to the user directly, if it is not considered sensitive, or persisted on the gateway for further processing and aggregation, depending on the use case and model configuration\.
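The Dice score quoted for Figure 3 measures the overlap between the predicted and the expert\-annotated mask\. The following sketch computes it for binary masks given as nested lists; the toy masks are hypothetical, and returning 1\.0 for two empty masks is a convention chosen here, not one taken from the source\.

```python
def dice_score(truth, pred):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    flat_t = [v for row in truth for v in row]
    flat_p = [v for row in pred for v in row]
    intersection = sum(1 for t, p in zip(flat_t, flat_p) if t and p)
    size_sum = sum(1 for t in flat_t if t) + sum(1 for p in flat_p if p)
    if size_sum == 0:
        return 1.0  # convention chosen here: two empty masks agree
    return 2.0 * intersection / size_sum


# Hypothetical 3x4 masks: 4 lesion pixels each, 3 of them overlapping.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 0]]
score = dice_score(truth, pred)
print(score)  # 2 * 3 / (4 + 4) = 0.75
```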
## V Basic statistics
Figure 4: Boxplots of key variables for both sites with median, quartiles, and range shown\.
One piece of information derived from the inferred segmentation mask is the brain lesion volume\. We assume this information is available in our clinical data tables and explore its relation to other clinical measurements\.
We begin with a simple exploratory data analysis, starting with a tableone computation that provides descriptive statistics for a selection of numerical columns\. The result is shown in Table [II](https://arxiv.org/html/2605.08223#S5.T2)\.

| Variable | n | Mean | SD | Min | Quartile 1 | Median | Quartile 3 | Max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Symbol Digit Modalities Test \(SDMT\) | 1386 | 55\.56 | 10\.15 | 27\.00 | 48\.60 | 54\.60 | 63\.00 | 87\.00 |
| Annualized Change \(CHG\) | 1386 | 0\.39 | 0\.31 | \-0\.48 | 0\.18 | 0\.38 | 0\.59 | 1\.40 |
| Expanded Disability Status Scale \(EDSS\) | 1386 | 3\.51 | 2\.16 | 0\.03 | 1\.71 | 2\.97 | 4\.93 | 14\.03 |
| Relapse Count \(RELAPSE\) | 1386 | 3\.11 | 1\.82 | 0\.00 | 1\.98 | 2\.97 | 3\.96 | 9\.00 |
| Censoring Indicator \(CNSR\) | 1386 | 0\.01 | 0\.11 | 0\.00 | 0\.01 | 0\.01 | 0\.01 | 1\.00 |
| Multiple Sclerosis Functional Composite \(MSFC\) | 1386 | \-2\.33 | 2\.02 | \-9\.33 | \-3\.72 | \-2\.32 | \-1\.07 | 6\.26 |
| Timed 25\-Foot Walk Test \(T25FWT\) | 1386 | 8\.91 | 2\.53 | 2\.00 | 6\.96 | 8\.88 | 10\.96 | 18\.00 |
| Nine\-Hole Peg Test \(9HPT\) | 1386 | 22\.66 | 9\.77 | 0\.00 | 15\.96 | 21\.66 | 28\.50 | 57\.00 |
| Brain Lesion Volume | 1386 | 2188 | 2136 | 1 | 414 | 1034 | 3722 | 10338 |
| Confirmed Disability Accumulation \(CDA\) | 1386 | 0\.0 | 0\.2 | 0\.0 | 0\.0 | 0\.0 | 0\.0 | 1\.0 |

TABLE II: The tableone function provides a summary of all key variables\. The values were rounded to two decimal places depending on their magnitude, and in the case of Brain Lesion Volume, to whole numbers\.
We apply this analysis both in a federated manner to the entire collection of datasets and separately to each site in order to identify potential differences in data distributions or magnitudes\. The result is shown in Figure [4](https://arxiv.org/html/2605.08223#S5.F4) as boxplots\. The comparison of both sites shows a small deviation in the MSFC and T25FWT distributions and a larger one for EDSS and Brain Lesion Volume\.
To get a clearer picture of how the different features in the data are connected, we examine the correlation matrix\. Here, a separate analysis of each site is compared to the federated analysis on the statistical ensemble\.
The correlation matrices computed at each site independently show a reasonably correlated block of the functional outcome measures of MS and an apparent negative correlation between EDSS and the volume of brain lesions \(cf\. Figure [5](https://arxiv.org/html/2605.08223#S5.F5)\)\.
Figure 5: Correlation matrices for each site, computed independently, show negative correlations between brain lesion volume and EDSS\.
The interpretation of this negative correlation is not immediately clear, so for context we consult the federated correlation matrix\. The federated analysis shows a markedly different relationship between the two variables compared to the individual dataset results\. As seen in Figure [6](https://arxiv.org/html/2605.08223#S5.F6), the two features exhibit a noticeable positive correlation\.
Figure 6: The federated correlation matrix based on the full data shows a positive correlation between brain lesion volume and EDSS\.
This analysis across sites reveals that the negative correlations observed in the individual subsets may merely be an example of Simpson’s Paradox \[[22](https://arxiv.org/html/2605.08223#bib.bib21)\], where correlations seen in different groups reverse when these groups are combined\. This can be confirmed by examining the underlying data in an aggregated scatter plot\. This is an intuitive way to compare the relationships observed in each site and the overall relationship seen across sites\.
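The sign reversal can be reproduced with a small sketch\. It assumes each site shares only the sufficient statistics of a Pearson correlation \(count, sums, sums of squares and the cross\-product sum\); the function names and the toy values are hypothetical and were chosen so that both sites show a negative correlation while the pooled data show a positive one\.

```python
from math import sqrt


def local_moments(xs, ys):
    """Site-side: sufficient statistics for a Pearson correlation."""
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxx": sum(x * x for x in xs),
        "syy": sum(y * y for y in ys),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
    }


def pooled(moments):
    """Central: the statistics are additive, so sites are simply summed."""
    return {key: sum(m[key] for m in moments) for key in moments[0]}


def pearson(m):
    """Pearson correlation computed from the sufficient statistics."""
    cov = m["sxy"] - m["sx"] * m["sy"] / m["n"]
    var_x = m["sxx"] - m["sx"] ** 2 / m["n"]
    var_y = m["syy"] - m["sy"] ** 2 / m["n"]
    return cov / sqrt(var_x * var_y)


# Hypothetical (EDSS, lesion volume in arbitrary units) values per site.
m1 = local_moments([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
m2 = local_moments([6.0, 7.0, 8.0], [8.0, 7.0, 6.0])

r_site_1 = pearson(m1)                 # negative within site 1
r_site_2 = pearson(m2)                 # negative within site 2
r_federated = pearson(pooled([m1, m2]))  # positive across sites
print(r_site_1, r_site_2, r_federated)
```

Within each site the variables decrease together, but the between\-site shift in both variables dominates once the statistics are pooled\.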
Figure 7: Scatter plot of discretized EDSS score vs\. Brain Lesion Volume\. Individual site results are shown in red/blue and exhibit a negative correlation, while aggregated results in purple show a positive correlation\.
Because privacy constraints prevent us from inspecting individual data points, we define a box discretization over the dimensions Brain Lesion Volume and EDSS, and perform a federated computation to count how many data points fall into each box\. These aggregated points can then be visualized with dot size depending on the number of observations in each box, which makes it clear in Figure [7](https://arxiv.org/html/2605.08223#S5.F7) how the differing correlations arise\. Note that care must be taken when choosing the binning \(box size\), as this may affect the visual perception of the correlation\.
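A minimal sketch of such a box count is given below\. The bin widths, the toy values and the function names are hypothetical; the optional `min_count` threshold for suppressing sparse boxes is an additional assumption that is not described in the source\.

```python
from collections import Counter


def local_bin_counts(edss, volume, edss_width, volume_width):
    """Site-side: count observations per (EDSS bin, volume bin) box."""
    counts = Counter()
    for e, v in zip(edss, volume):
        counts[(int(e // edss_width), int(v // volume_width))] += 1
    return counts


def merge_counts(site_counts, min_count=1):
    """Central: add the per-site counts and drop boxes below min_count."""
    total = Counter()
    for counts in site_counts:
        total.update(counts)
    return {box: n for box, n in total.items() if n >= min_count}


# Hypothetical toy data; boxes are 1.0 EDSS points by 500 volume units.
c1 = local_bin_counts([1.2, 1.8, 2.4], [900.0, 950.0, 400.0], 1.0, 500.0)
c2 = local_bin_counts([1.5, 5.1], [980.0, 3200.0], 1.0, 500.0)

merged = merge_counts([c1, c2])
print(merged)
```

Each key is a pair of bin indices, and the count can be mapped to the dot size of the aggregated scatter plot\.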
## VI Survival Analysis
The previous section showcased how federated image models can help to detect brain lesions and how to assess their impact on other clinical measurements with the help of federated statistics\. To analyze the disease progression over time, this section focuses on survival analysis\.
The first step is to prepare the date variables\. Apheris preprocessing converts standard date formats into datetime objects that are easy to work with\. A tableone summary of the date variables is shown in Table [III](https://arxiv.org/html/2605.08223#S6.T3)\.
TABLE III: tableone results on date columns
### VI\-A Kaplan\-Meier Plots
Datetime objects allow the calculation of date differences\. The days since diagnosis are derived from the difference of Visit Date and Diagnosis Date and then used to compute a Kaplan\-Meier survival function \[[14](https://arxiv.org/html/2605.08223#bib.bib9)\]\. In this instance, we are not analyzing patient mortality, but rather the probability of developing an MS\-induced disability, as defined by worsening EDSS\. For this example, we define the clinical event as $\mathrm{EDSS} > 2.0$, representing the transition from subclinical neurological signs to measurable minimal disability, and the Kaplan\-Meier curve in Figure [8](https://arxiv.org/html/2605.08223#S6.F8) shows the "survival" probability of not worsening beyond this threshold\. This threshold was selected to maximize the sensitivity of the survival analysis in detecting early disease progression and to provide a sufficient number of events for correlation with imaging biomarkers, which typically manifest early in the disease course\. In a real medical application, clinically established milestones such as $\mathrm{EDSS} > 3.0$ or $4.0$ should be used instead\.
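The derivation of the time\-to\-event variable can be sketched for a single patient as follows\. The function `time_to_event` is hypothetical, and censoring a patient at the last visit when the threshold is never exceeded is an assumption, since the source does not describe its censoring rule\.

```python
from datetime import date


def time_to_event(diagnosis_date, visits, threshold=2.0):
    """Return (days since diagnosis, event flag) for one patient.

    visits is a list of (visit_date, edss) tuples. The event is the first
    visit with EDSS above the threshold; a patient who never exceeds it
    is treated as censored at the last visit (assumption).
    """
    ordered = sorted(visits)
    for visit_date, edss in ordered:
        if edss > threshold:
            return (visit_date - diagnosis_date).days, 1
    return (ordered[-1][0] - diagnosis_date).days, 0


# Hypothetical patients, both diagnosed on 2020-01-01.
progressed = time_to_event(
    date(2020, 1, 1),
    [(date(2020, 7, 1), 1.5), (date(2021, 1, 1), 2.5), (date(2021, 7, 1), 3.0)],
)
censored = time_to_event(date(2020, 1, 1), [(date(2020, 3, 1), 1.0)])
print(progressed, censored)  # (366, 1) (60, 0)
```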
Figure 8: Kaplan\-Meier plot for the event of EDSS exceeding 2 with time intervals used to preserve privacy\.
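One way to obtain an interval\-based Kaplan\-Meier curve without sharing patient rows is for each site to report only event and censoring counts per time interval\. The sketch below follows this idea; the interval width of 365 days, the toy data and the function names are hypothetical, and treating all patients who enter an interval as at risk for the whole interval is a simplification\.

```python
def local_interval_counts(durations, events, width, n_intervals):
    """Site-side: per-interval numbers of events and censored patients."""
    d = [0] * n_intervals
    c = [0] * n_intervals
    for t, e in zip(durations, events):
        k = min(int(t // width), n_intervals - 1)
        if e:
            d[k] += 1
        else:
            c[k] += 1
    return d, c


def kaplan_meier(site_counts):
    """Central: product-limit estimate from the summed interval counts."""
    n_intervals = len(site_counts[0][0])
    d = [sum(s[0][k] for s in site_counts) for k in range(n_intervals)]
    c = [sum(s[1][k] for s in site_counts) for k in range(n_intervals)]
    at_risk = sum(d) + sum(c)
    survival, s = [], 1.0
    for k in range(n_intervals):
        if at_risk > 0:
            s *= 1.0 - d[k] / at_risk
        survival.append(s)
        at_risk -= d[k] + c[k]  # leave the risk set after this interval
    return survival


# Hypothetical days since diagnosis and event flags (1 = EDSS > 2.0).
site_1 = local_interval_counts([100, 400, 800], [1, 0, 1], 365, 3)
site_2 = local_interval_counts([200, 500, 900, 1000], [0, 1, 0, 1], 365, 3)
curve = kaplan_meier([site_1, site_2])
print([round(p, 3) for p in curve])  # [0.857, 0.686, 0.229]
```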
### VI\-B Cox Proportional Hazards Models
We can now analyze individual patient data in a privacy\-preserving way while simultaneously modeling disease progression at the population level\. To unify these two perspectives, we conclude this case study with a Cox model \[[5](https://arxiv.org/html/2605.08223#bib.bib10)\], which combines a baseline hazard function $H_0(t)$ with a covariate\-dependent component $\exp(\beta X)$, thus incorporating individual patient features as explanatory variables in the analysis\. In the following, we use the federated implementation of Cox regression proposed by Andreux et al\. \[[1](https://arxiv.org/html/2605.08223#bib.bib11)\]\.
Together, these two components constitute the Cox proportional hazard function:
$$H(t \mid X) = H_0(t) \cdot \exp(\beta X). \tag{1}$$
From this, reading $H$ as the cumulative hazard, we can derive the survival function as
$$S(t \mid X) = \exp\left(-H(t \mid X)\right). \tag{2}$$
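Equations \(1\) and \(2\) can be evaluated directly once a baseline cumulative hazard and the coefficients are available\. The sketch below uses hypothetical values for $H_0(t)$, $\beta$ and $X$ that are not taken from the fitted model; the function name `cox_survival` is likewise an assumption\.

```python
from math import exp


def cox_survival(baseline_cum_hazard, beta, x):
    """S(t|X) = exp(-H0(t) * exp(beta . x)) on a grid of time points."""
    risk = exp(sum(b * xi for b, xi in zip(beta, x)))
    return [exp(-h0 * risk) for h0 in baseline_cum_hazard]


# Hypothetical baseline cumulative hazard at three time points.
h0_grid = [0.0, 0.1, 0.3]
# Hypothetical coefficients for two covariates.
beta = [0.5, -0.2]

reference = cox_survival(h0_grid, beta, [0.0, 0.0])    # baseline patient
higher_risk = cox_survival(h0_grid, beta, [1.0, 0.0])  # first covariate raised
print(reference, higher_risk)
```

A positive coefficient scales the hazard up, so the survival curve of the second patient lies on or below the baseline curve at every time point\.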
Figure 9: Survival function derived from Cox baseline hazard\.
The covariate coefficients in the Cox model quantify the influence of each feature on the hazard\. Table [IV](https://arxiv.org/html/2605.08223#S6.T4) lists these coefficients, and Figure [9](https://arxiv.org/html/2605.08223#S6.F9) shows the survival function derived from the baseline hazard\. The coefficients normalized by the feature means can be interpreted as an indicator of feature importance\.
TABLE IV: Coefficients of a Cox proportional hazards model for the event of EDSS exceeding 2, normalized by mean value\.
#### VI\-B1 Cox Regression with Categorical Features
Categorical variables can also be incorporated into the Cox proportional hazards model\. To this end, an appropriate encoding is required\. In our federated analysis, we first apply one\-hot encoding using the Apheris preprocessing framework and then reduce the resulting high\-dimensional feature space via principal component analysis \(PCA\)\[[13](https://arxiv.org/html/2605.08223#bib.bib13)\]\.
As demonstrated earlier in this case study, Apheris Statistics enables federated computation of the covariance matrix, from which the transformation vectors for the PCA are derived\. Figure[10](https://arxiv.org/html/2605.08223#S6.F10)illustrates the mapping from the one\-hot encoded feature space to four principal components\.
Figure 10: PCA transformation for one\-hot encoded categoricals\.
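The step from a federated covariance matrix to principal components can be sketched as follows\. It assumes each site shares its row count, column sums and cross\-product matrix of the one\-hot encoded features; the function names and toy rows are hypothetical, and power iteration is used here only to keep the example dependency\-free\. It recovers the leading component, whereas the analysis above maps to four components\.

```python
def local_gram(rows):
    """Site-side: row count, column sums and cross-product matrix X^T X."""
    p = len(rows[0])
    sums = [sum(r[j] for r in rows) for j in range(p)]
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    return len(rows), sums, gram


def pooled_covariance(site_stats):
    """Central: sample covariance matrix from the summed site statistics."""
    p = len(site_stats[0][1])
    n = sum(s[0] for s in site_stats)
    mean = [sum(s[1][j] for s in site_stats) / n for j in range(p)]
    return [
        [(sum(s[2][i][j] for s in site_stats) - n * mean[i] * mean[j]) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def leading_component(cov, iterations=200):
    """Power iteration for the eigenvector with the largest eigenvalue."""
    p = len(cov)
    v = [1.0 / (j + 1) for j in range(p)]  # deterministic, non-uniform start
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


# Hypothetical one-hot rows for a categorical variable with three levels.
site_1_rows = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]
site_2_rows = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

cov = pooled_covariance([local_gram(site_1_rows), local_gram(site_2_rows)])
pc = leading_component(cov)
print([round(x, 3) for x in pc])
```

Because the one\-hot columns of a single variable always sum to one, the covariance matrix is singular, which is one reason why a reduction to fewer components loses little information\.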
## VII Conclusion
Federated learning offers a powerful paradigm shift for medical research\. By enabling models to learn from distributed data sources without ever exposing sensitive patient information, it overcomes one of the largest barriers to collaborative healthcare innovation\. As demonstrated, this approach allows researchers to uncover deeper insights from the full statistical ensemble rather than depending on fragmented, isolated data subsets\. The result is stronger evidence, more robust models, and findings that better reflect real\-world patient populations\.
With purpose\-built tooling for image segmentation, statistical modeling, and survival analysis, Apheris provides the technical foundation needed to drive this new generation of research\. By combining privacy\-preserving infrastructure with domain\-specific analytics, Apheris empowers clinical researchers, data scientists, and pharma institutions to accelerate discovery\. The code for the showcase presented here is available as a notebook at https://www\.github\.com/apheris/simulated\-brain\-lesion\-analysis\.
## References
- \[1\]M\. Andreux, A\. Manoel, R\. Menuet, C\. Saillard, and C\. Simpson\(2020\)Federated survival analysis with discrete‐time cox models\.Note:International Workshop on Federated Learning for User Privacy and Data Confidentiality in conjunction with ICML 2020arXiv:2006\.08997External Links:[Link](https://arxiv.org/abs/2006.08997)Cited by:[§VI\-B](https://arxiv.org/html/2605.08223#S6.SS2.p1.2)\.
- \[2\]ApherisRegister & unregister datasets\.Note:https://www\.apheris\.com/docs/gateway/latest/data\-custodian/register\-and\-unregister\-datasets\.htmlAccessed: 2025\-12\-04Cited by:[§II\-B](https://arxiv.org/html/2605.08223#S2.SS2.p1.1)\.
- \[3\]L\. Bai, D\. Wang, H\. Wang, M\. Barnett, M\. Cabezas, W\. Cai, F\. Calamante, K\. Kyle, D\. Liu, L\. Ly, A\. Nguyen, C\. Shieh, R\. Sullivan, G\. Zhan, W\. Ouyang, and C\. Wang\(2024\)Improving multiple sclerosis lesion segmentation across clinical sites: a federated learning approach with noise\-resilient training\.Artificial Intelligence in Medicine152,pp\. 102872\.External Links:ISSN 0933\-3657,[Document](https://dx.doi.org/https%3A//doi.org/10.1016/j.artmed.2024.102872),[Link](https://www.sciencedirect.com/science/article/pii/S0933365724001143)Cited by:[§III](https://arxiv.org/html/2605.08223#S3.p1.1)\.
- \[4\]O\. Choudhury, E\. Trautmann, I\. Hales, J\. Prieto, and U\. Ratan\(2025\-08\-11\)Federated learning\-based protein language models with apheris on aws\.Note:AWS for Industries blogAccessed 2025External Links:[Link](https://aws.amazon.com/blogs/industries/federated-learning-based-protein-language-models-with-apheris-on-aws/)Cited by:[Figure 1](https://arxiv.org/html/2605.08223#S2.F1)\.
- \[5\]D\. R\. Cox\(1972\)Regression models and life\-tables\.Journal of the Royal Statistical Society: Series B \(Methodological\)34\(2\),pp\. 187–220\.Cited by:[§VI\-B](https://arxiv.org/html/2605.08223#S6.SS2.p1.2)\.
- \[6\]A\. R\. Elkordy, Y\. H\. Ezzeldin, S\. Han, S\. Sharma, C\. He, S\. Mehrotra, and S\. Avestimehr\(2023\)Federated analytics: a survey\.APSIPA Transactions on Signal and Information Processing12\(1\),pp\. e4\.External Links:[Document](https://dx.doi.org/10.1561/116.00000063)Cited by:[§II](https://arxiv.org/html/2605.08223#S2.p1.1)\.
- \[7\]F\. Guarnera, A\. Rondinella, E\. Crispino, G\. Russo, C\. D\. Lorenzo, D\. Maimone, F\. Pappalardo, and S\. Battiato\(2025\-06\)MSLesSeg: baseline and benchmarking of a new Multiple Sclerosis Lesion Segmentation dataset\.External Links:[Link](https://springernature.figshare.com/articles/dataset/MSLesSeg_baseline_and_benchmarking_of_a_new_Multiple_Sclerosis_Lesion_Segmentation_dataset/27919209),[Document](https://dx.doi.org/10.6084/m9.figshare.27919209.v1)Cited by:[§I](https://arxiv.org/html/2605.08223#S1.p3.1),[§IV\-A](https://arxiv.org/html/2605.08223#S4.SS1.p1.1)\.
- \[8\]I\. Hagestedt, I\. Hales, E\. Boernert, H\. R\. Roth, M\. A\. Hoeh, R\. Röhm, E\. Dobson, and J\. T\. Prieto\(2024\)Toward a tipping point in federated learning in healthcare and life sciences\.Patterns5\(11\),pp\. 101077\.External Links:ISSN 2666\-3899,[Document](https://dx.doi.org/https%3A//doi.org/10.1016/j.patter.2024.101077),[Link](https://www.sciencedirect.com/science/article/pii/S2666389924002368)Cited by:[§II\-A](https://arxiv.org/html/2605.08223#S2.SS1.p3.1)\.
- \[9\]I\. Hagestedt\(2025\-April 24\)What is a federated data network and how does it support cross‑institutional research?\.Note:Apheris blogPublished December 3, 2024, last updated April 24, 2025External Links:[Link](https://www.apheris.com/resources/blog/federated-data-networks)Cited by:[§II\-A](https://arxiv.org/html/2605.08223#S2.SS1.p3.1)\.
- \[10\]S\. Hindawi, B\. Szubstarski, E\. Boernert, B\. Tackenberg, and J\. Wuerfel\(2025\)Federated learning for lesion segmentation in multiple sclerosis: a real\-world multi\-center feasibility study\.Frontiers in NeurologyVolume 16 \- 2025\.External Links:[Link](https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2025.1620469),[Document](https://dx.doi.org/10.3389/fneur.2025.1620469),ISSN 1664\-2295Cited by:[§I](https://arxiv.org/html/2605.08223#S1.p1.1),[§III](https://arxiv.org/html/2605.08223#S3.p2.1),[§IV\-B](https://arxiv.org/html/2605.08223#S4.SS2.p5.1)\.
- \[11\]F\. Isensee, P\. F\. Jaeger, S\. A\. A\. Kohl, J\. Petersen, and K\. H\. Maier\-Hein\(2021\)NnU\-net: a self\-configuring method for deep learning\-based biomedical image segmentation\.Nature Methods18\(2\),pp\. 203–211\.External Links:[Document](https://dx.doi.org/10.1038/s41592-020-01008-z),[Link](https://doi.org/10.1038/s41592-020-01008-z),ISSN 1548\-7105Cited by:[§IV](https://arxiv.org/html/2605.08223#S4.p1.1)\.
- \[12\]F\. Isensee, J\. Petersen, A\. Klein, D\. Zimmerer, P\. F\. Jaeger, S\. Kohl, J\. Wasserthal, G\. Koehler, T\. Norajitra, S\. Wirkert, and K\. H\. Maier\-Hein\(2018\)NnU\-net: self\-adapting framework for u\-net\-based medical image segmentation\.External Links:1809\.10486,[Link](https://arxiv.org/abs/1809.10486)Cited by:[§IV](https://arxiv.org/html/2605.08223#S4.p1.1)\.
- \[13\]I\. T\. Jolliffe\(2002\)Principal component analysis\.2nd edition,Springer Series in Statistics,Springer\-Verlag\.External Links:ISBN 978\-0\-387\-95442\-4,[Document](https://dx.doi.org/10.1007/b98835)Cited by:[§VI\-B1](https://arxiv.org/html/2605.08223#S6.SS2.SSS1.p1.1)\.
- \[14\]E\. L\. Kaplan and P\. Meier\(1958\)Individual nonparametric estimation from incomplete observations\.Journal of the American Statistical Association53\(282\),pp\. 457–481\.External Links:[Document](https://dx.doi.org/10.1080/01621459.1958.10501452)Cited by:[§VI\-A](https://arxiv.org/html/2605.08223#S6.SS1.p1.4)\.
- \[15\]B\. McMahan, E\. Moore, D\. Ramage, S\. Hampson, and B\. A\. y Arcas\(2017\)Communication\-efficient learning of deep networks from decentralized data\.InProceedings of the 20th International Conference on Artificial Intelligence and Statistics \(AISTATS\),pp\. 1273–1282\.Cited by:[§III](https://arxiv.org/html/2605.08223#S3.p1.1)\.
- \[16\]F\. Niro, M\. Di Renzo, P\. Agnello, M\. Petyx, G\. Ciaramella, F\. Martinelli, M\. Cesarelli, A\. Santone, and F\. Mercaldo\(2026\)A privacy\-preserving method for explainable multiple sclerosis detection through federated machine learning\.InImage Analysis and Processing \- ICIAP 2025 Workshops,E\. Rodolà, F\. Galasso, and I\. Masi \(Eds\.\),Cham,pp\. 29–40\.External Links:ISBN 978\-3\-032\-11381\-8Cited by:[§III](https://arxiv.org/html/2605.08223#S3.p1.1)\.
- \[17\]J\. Oh, J\. Smolders, F\. Buijs, R\. Pedotti, F\. Dahlke, L\. Kaczmarek, A\. Kemmisetti, V\. Sharma, D\. Heinzmann, B\. Tackenberg, A\. Bar\-Or, and H\. Wiendl\(2023\-10\)Utility and implementation of a federated research infrastructure to assess lack of disease stability as a real\-world surrogate of PIRA, by combining MS clinical trial and real\-world cohort data \(the INTONATE MS consortium\)\.Roche and Genentech\.Note:https://medically\.gene\.com/global/en/unrestricted/neuroscience/ECTRIMS\-2023/ectrims\-2023\-poster\-oh\-utility\-and\-implementation\.htmlECTRIMS 2023 Conference PosterCited by:[§I](https://arxiv.org/html/2605.08223#S1.p1.1),[§III](https://arxiv.org/html/2605.08223#S3.p2.1)\.
- \[18\]J\. Oh, J\. Smolders, F\. Buijs, J\. Federer\-Gsponer, G\. M\. zu Hörste, M\. Mamdani, C\. Perrone, D\. L\. Mowery, T\. Kühnel, K\. van Tulder, C\. Testa, M\. C\. Elze, R\. Pedotti, L\. Kaczmarek, V\. Sharma, A\. Tackenberg, and H\. Wiendl\(2025\)Integrating multicentre data to explore rwpirma: results from the intonate\-ms consortium\.InECTRIMS 2025,Barcelona, Spain\.Note:Conference poster, INTONATE\-MS consortiumCited by:[§I](https://arxiv.org/html/2605.08223#S1.p1.1)\.
- \[19\]J\. Oh, J\. Smolders, F\. Buijs, R\. Pedotti, F\. Dahlke, L\. Kaczmarek, A\. Kemmisetti, V\. Sharma, D\. Heinzmann, B\. Tackenberg,et al\.\(2023\)Utility and implementation of a federated research infrastructure to assess lack of disease stability as a real\-world surrogate of pira, by combining ms clinical trial and real\-world cohort data \(the intonate\-ms consortium\)\.Multiple Sclerosis Journal29\.Cited by:[§I](https://arxiv.org/html/2605.08223#S1.p1.1)\.
- \[20\]A\. Pirmani, E\. De Brouwer, A\. Arany, M\. Oldenhof, A\. Passemiers, A\. Faes, T\. Kalincik, S\. Ozakbas, R\. Gouider, B\. Willekens, D\. Horakova, E\. K\. Havrdova, F\. Patti, A\. Prat, A\. Lugaresi, V\. Tomassini, P\. Grammond, E\. Cartechini, I\. Roos, C\. Boz, R\. Alroughani, M\. P\. Amato, K\. Buzzard, J\. Lechner\-Scott, J\. Guimarães, C\. Solaro, O\. Gerlach, A\. Soysal, J\. Kuhle, J\. L\. Sanchez\-Menoyo, D\. Spitaleri, T\. Csepany, B\. Van Wijmeersch, R\. Ampapa, J\. Prevost, S\. J\. Khoury, V\. Van Pesch, N\. John, D\. Maimone, B\. Weinstock\-Guttman, G\. Laureys, P\. McCombe, Y\. Blanco, A\. Altintas, A\. Al\-Asmi, J\. Garber, A\. Van der Walt, H\. Butzkueven, K\. de Gans, C\. Rozsa, B\. Taylor, T\. Al\-Harbi, A\. Sas, C\. Rajda, O\. Gray, D\. Decoo, W\. M\. Carroll, A\. G\. Kermode, M\. Fabis\-Pedrini, D\. Mason, A\. Perez\-Sempere, M\. Simu, N\. Shuey, B\. Singhal, M\. Cauchi, T\. A\. Hardy, S\. Ramanathan, P\. Lalive, C\. Sirbu, S\. Hughes, T\. Castillo Trivino, L\. M\. Peeters, and Y\. Moreau\(2025\)Personalized federated learning for predicting disability progression in multiple sclerosis using real\-world routine clinical data\.npj Digital Medicine8\(1\),pp\. 478\.External Links:[Document](https://dx.doi.org/10.1038/s41746-025-01788-8),[Link](https://doi.org/10.1038/s41746-025-01788-8),ISSN 2398\-6352Cited by:[§III](https://arxiv.org/html/2605.08223#S3.p1.1)\.
- \[21\]A\. Pirmani, E\. De Brouwer, L\. Geys, T\. Parciak, Y\. Moreau, and L\. Peeters\(2023\)The journey of data within a global data sharing initiative: a federated 3\-layer data analysis pipeline to scale up multiple sclerosis research\.JMIR Medical Informatics11,pp\. e48030\.External Links:[Document](https://dx.doi.org/10.2196/48030),[Link](https://medinform.jmir.org/2023/1/e48030)Cited by:[§III](https://arxiv.org/html/2605.08223#S3.p1.1)\.
- \[22\]E\. H\. Simpson\(1951\)The interpretation of interaction in contingency tables\.Journal of the Royal Statistical Society, Series B13\(2\),pp\. 238–241\.External Links:[Document](https://dx.doi.org/10.1111/j.2517-6161.1951.tb00088.x)Cited by:[§V](https://arxiv.org/html/2605.08223#S5.p7.1)\.
- \[23\]M\. Trojano, P\. Iaffaldano, M\. Copetti, J\. Drahota, L\. Forsberg, E\. F\. Mouresan, L\. Pontieri, T\. Spelman, N\. Toschi, H\. Butzkueven, A\. Glaser, J\. Hillert, D\. Horakova, M\. Magyari, S\. Vukusic, G\. Lucisano, and T\. Kalincik\(2025\)Big multiple sclerosis data network: novel modelling approaches for real\-world data analysis\.Journal of Neurology272\(12\),pp\. 754\.External Links:[Document](https://dx.doi.org/10.1007/s00415-025-13439-9)Cited by:[§III](https://arxiv.org/html/2605.08223#S3.p1.1)\.