Anne Hartebrodt scite author profile

Motivation Federated learning enables privacy preserving machine learning in the medical domain because the sensitive patient data remains with the owner and only parameters are exchanged between the data holders. The federated scenario introduces specific challenges related to the decentralized nature of the data, such as batch effects and differences in study population between the sites. Here, we investigate the challenges of moving classical analysis methods to the federated domain, specifically principal component analysis, a versatile and widely used tool, often serving as an initial step in machine learning and visualization workflows. We provide implementations of different federated PCA algorithms and evaluate them regarding their accuracy for high dimensional biological data using realistic sample distributions over multiple data sites, and their ability to preserve downstream analyses. Results Federated subspace iteration converges to the centralized solution even for unfavorable data distributions, while approximate methods introduce error. Larger sample sizes at the study sites lead to better accuracy of the approximate methods. Approximate methods may be sufficient for coarse data visualization, but are vulnerable to outliers and batch effects. Before the analysis, the PCA algorithm, as well as the number of eigenvectors should be considered carefully to avoid unnecessary communication overhead. Availability Simulation code and notebooks for federated PCA can be found at https://gitlab.com/roettgerlab/federatedPCA; the code for the federated app is available at https://github.com/AnneHartebrodt/fc-federated-pca Supplementary information Supplementary data are available at Bioinformatics Advances online.

show abstract

De Novo Pathway Enrichment with KeyPathwayMiner

Alcaraz

Hartebrodt

List

2019

View full text Add to dashboard Cite

Federated singular value decomposition for high dimensional data

Hartebrodt¹,

Röttger²,

Blumenthal³

2022

Preprint

View full text Add to dashboard Cite

Federated learning (FL) is emerging as a privacy-aware alternative to classical cloud-based machine learning. In FL, the sensitive data remains in data silos and only aggregated parameters are exchanged. Hospitals and research institutions which are not willing to share their data can join a federated study without breaching confidentiality. In addition to the extreme sensitivity of biomedical data, the high dimensionality poses a challenge in the context of federated genome-wide association studies (GWAS). In this article, we present a federated singular value decomposition (SVD) algorithm, suitable for the privacy-related and computational requirements of GWAS. Notably, the algorithm has a transmission cost independent of the number of samples and is only weakly dependent on the number of features, because the singular vectors associated with the samples are never exchanged and the vectors associated with the features only for a fixed number of iterations. Although motivated by GWAS, the algorithm is generically applicable for both horizontally and vertically partitioned data.

show abstract

The FeatureCloud AI Store for Federated Learning in Biomedicine and Beyond

Matschinske¹,

Späth²,

Nasirigerdeh³

et al. 2021

Preprint

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Anne Hartebrodt

Federated Principal Component Analysis for Genome-Wide Association Studies

Federated horizontally partitioned principal component analysis for biomedical applications

De Novo Pathway Enrichment with KeyPathwayMiner

Federated singular value decomposition for high dimensional data

The FeatureCloud AI Store for Federated Learning in Biomedicine and Beyond

Contact Info

Product

Resources

About