2021
DOI: 10.1186/s13059-021-02313-2

Schema: metric learning enables interpretable synthesis of heterogeneous single-cell modalities

Abstract: A complete understanding of biological processes requires synthesizing information across heterogeneous modalities, such as age, disease status, or gene expression. Technological advances in single-cell profiling have enabled researchers to assay multiple modalities simultaneously. We present Schema, which uses a principled metric learning strategy that identifies informative features in a modality to synthesize disparate modalities into a single coherent interpretation. We use Schema to infer cell types by in…

Cited by 32 publications (43 citation statements)
References 55 publications
“…Recent algorithms for the analysis of multi-modal data were developed to process paired datasets, in which both modalities have been profiled at the same cell [8, 7]. These algorithms handle multi-modal data, but lack the ability to integrate single modality datasets into the same analysis.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Given such a dataset (which are much more common than multi-modal datasets), use of multi-modal data can enable an inference of the “missing” modality and thus reach new conclusions about the diversity and regulation of cell states in a wide array of tissues, cell types and experimental or clinical settings. While some computational methods have emerged that can analyze multi-modal data in isolation [7, 8], the joint analysis of multi-modal and single-modality data necessitates the development of novel computational methods. These methods must be capable of leveraging the power of multi-modal data while accounting for the general caveats of single cell genomics data, most prominently - batch effects, limited sensitivity and noise, and taking into consideration the unique statistical properties of each modality (i.e., quantitative signal for scRNA-seq and a largely binary signal for scATAC-seq).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The advent of single-cell multimodal omics has posed additional computational challenges in extrapolating useful information from the different layers measured for each cell. This has motivated the introduction of harmonization methods for performing a joint analysis, such as clustering, to exploit the power given by all available cell modalities (Argelaguet et al, 2020; Wang et al, 2020; Gayoso et al, 2021; Hao et al, 2021; Singh et al, 2021; Zuo and Chen, 2021). It is worth to remark that DL approaches (e.g., deep generative models) represent a significant part of harmonization methods for single-cell datasets.…”
Section: Computational Approaches To Analyze Single-cell Data Of the TME
Citation type: mentioning
confidence: 99%
“…These follow two main frameworks: metric learning and latent variable learning. Weighted nearest neighbors (WNN) (Hao et al, 2021) and Schema (Singh et al, 2021) explore, respectively, nearest neighbors and quadratic programming to estimate a single distance matrix representing the integrated multimodal data. Both approaches explore efficient algorithms, but do not explicitly provide models associating molecular features to the ‘latent space’.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
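
The last statement above describes metric learning as estimating a single distance that integrates modalities. As a rough illustration only, the sketch below learns non-negative per-feature weights on a primary modality so that the weighted squared distance between cells tracks a secondary modality (here, a binary label). This is not Schema's quadratic-programming formulation: it swaps in a plain least-squares objective solved by projected gradient descent, and every name in it (`make_toy_data`, `pair_features`, `pair_targets`, `learn_feature_weights`) as well as the toy data are hypothetical.

```python
import random
from itertools import combinations


def make_toy_data(n_cells=40, seed=0):
    """Toy primary modality: feature 0 tracks the label, feature 1 is noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_cells)]  # toy secondary modality (binary label)
    X = [[label + rng.uniform(-0.1, 0.1), rng.uniform(-1.0, 1.0)] for label in y]
    return X, y


def pair_features(X):
    """Per-feature squared differences for every pair of cells."""
    return [[(a - b) ** 2 for a, b in zip(X[i], X[j])]
            for i, j in combinations(range(len(X)), 2)]


def pair_targets(y):
    """Secondary-modality distance per pair: 0 if labels match, else 1."""
    return [0.0 if y[i] == y[j] else 1.0
            for i, j in combinations(range(len(y)), 2)]


def learn_feature_weights(X, y, steps=500, lr=0.1):
    """Fit w >= 0 so that sum_k w[k] * (x_ik - x_jk)**2 approximates the
    secondary-modality distance (least squares, projected gradient descent)."""
    D = pair_features(X)
    t = pair_targets(y)
    n_feat = len(X[0])
    w = [1.0] * n_feat  # start from the unweighted squared Euclidean distance
    for _ in range(steps):
        grad = [0.0] * n_feat
        for d, target in zip(D, t):
            err = sum(wk * dk for wk, dk in zip(w, d)) - target
            for k in range(n_feat):
                grad[k] += 2.0 * err * d[k] / len(D)
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, wk - lr * gk) for wk, gk in zip(w, grad)]
    return w


X, y = make_toy_data()
weights = learn_feature_weights(X, y)
print([round(wk, 3) for wk in weights])
```

In this toy setting the learned weights can be read as a feature ranking: the feature that agrees with the secondary modality receives a large weight and the noise feature is driven toward zero, which is the sense in which a learned metric of this kind stays interpretable.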