2018
DOI: 10.1101/395947
Preprint

MultiPLIER: a transfer learning framework for transcriptomics reveals systemic features of rare disease

Abstract: Unsupervised machine learning methods provide a promising means to analyze and interpret large datasets. However, most datasets generated by individual researchers remain too small to fully benefit from these methods. In the case of rare diseases, there may be too few cases available, even when multiple studies are combined. We sought to determine whether or not machine learning models could be constructed from large public data compendia and then transferred to small datasets for subsequent analysis. W…

Cited by 46 publications (69 citation statements)
References 61 publications
“…We are interested to see the variety of unsupervised dimensionality reduction techniques that can be repurposed in this supervised setting. For instance, we may want to estimate components which correspond to genetic pathways, similarly to (27,28). This can be accomplished under the differential PCA framework by using a Bayesian PCA method with a prior that links genes according to pathway annotations.…”
Section: Discussion (mentioning)
confidence: 99%
“…We retrieved a model [41] that was previously trained on the recount2 RNA-seq dataset [25,42] and then used it to assess the expression of latent variables in the pan-NF dataset. We retrieved code for this analysis from the public repository for MultiPLIER [27] (https://github.com/greenelab/multi-plier). To project the NF data into the MultiPLIER model, we used the GetNewDataB function.…”
Section: Latent Variable Calculation and Selection (mentioning)
confidence: 99%
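
The projection step described in the statement above can be illustrated with a minimal sketch. It assumes that projecting new data into a trained model amounts to a ridge-regularized least-squares fit of row-normalized expression values onto the model's fixed gene loadings. It is written in Python with NumPy rather than the repository's R, it is not the source of GetNewDataB, and the function name `project_into_latent_space`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def project_into_latent_space(expr, expr_genes, Z, model_genes, l2):
    """Project new expression data onto fixed loadings (illustrative only).

    expr        : array, genes x samples (new dataset)
    expr_genes  : gene identifiers for the rows of expr
    Z           : array, genes x latent variables (loadings of a trained model)
    model_genes : gene identifiers for the rows of Z
    l2          : ridge penalty (assumed to come from the trained model)
    Returns B   : array, latent variables x samples
    """
    # z-score each gene across samples; constant genes are left at zero
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, ddof=1, keepdims=True)
    std[std == 0] = 1.0
    expr_norm = (expr - mean) / std

    # keep only genes shared by the new data and the model, in model order
    expr_index = {gene: i for i, gene in enumerate(expr_genes)}
    model_rows = [i for i, gene in enumerate(model_genes) if gene in expr_index]
    expr_rows = [expr_index[model_genes[i]] for i in model_rows]
    Zc = Z[model_rows]
    Yc = expr_norm[expr_rows]

    # ridge solution: B = (Zc^T Zc + l2 * I)^(-1) Zc^T Yc
    k = Zc.shape[1]
    return np.linalg.solve(Zc.T @ Zc + l2 * np.eye(k), Zc.T @ Yc)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genes = [f"gene{i}" for i in range(8)]   # toy identifiers
    Z = rng.normal(size=(8, 3))              # toy loadings: 8 genes, 3 latent variables
    expr = rng.normal(size=(8, 5))           # toy data: 8 genes, 5 samples
    B = project_into_latent_space(expr, genes, Z, genes, l2=1.0)
    print(B.shape)
```

Because the loadings stay fixed, only the small linear system above has to be solved for the new dataset, which is what makes this kind of transfer feasible for studies with few samples.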
“…We analyzed transcriptomic data from NF in the context of latent variables from MultiPLIER, a machine learning resource designed to aid in rare disease analyses [27]. Raw transcriptomic data from NF were retrieved from the NF Data Portal, reprocessed, and stored in Synapse as described above.…”
Section: Latent Variable Calculation and Selection (mentioning)
confidence: 99%