Motivation: Deep learning techniques have yielded tremendous progress in the field of computational biology over the last decade; however, many of these techniques are opaque to the user. To provide interpretable results, methods have incorporated biological priors directly into the learning task; one such biological prior is pathway structure. While pathways represent most biological processes in the cell, their high degree of correlation and hierarchical structure make it difficult to determine an appropriate computational representation.

Results: Here, we present pathway module Variational Autoencoder (pmVAE). Our method encodes pathway information by restricting the structure of our VAE to mirror gene-pathway memberships. Its architecture is composed of a set of subnetworks, which we refer to as pathway modules. The subnetworks learn interpretable latent representations by factorizing the latent space according to pathway gene sets. We directly address correlation between pathways by balancing a module-specific local loss and a global reconstruction loss. Furthermore, since many pathways are by nature hierarchical and therefore the product of multiple downstream signals, we model each pathway as a multidimensional vector. Due to their factorization over pathways, the representations allow for easy and interpretable analysis of multiple downstream effects, such as cell type and biological stimulus, within the context of each pathway. We compare pmVAE against two other state-of-the-art methods on two single-cell RNA-seq case-control data sets, demonstrating that our pathway representations are both more discriminative and more consistent in detecting pathways targeted by a perturbation.

Availability and implementation: https://github.com/ratschlab/pmvae
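To illustrate the idea of factorizing a VAE's latent space over pathway gene sets, the sketch below shows one possible implementation in PyTorch. It is not the published pmVAE code (which lives at https://github.com/ratschlab/pmvae and differs in framework and detail); class names such as `PathwayModule` and `PathwayModularVAE`, and hyperparameters like `beta_local` and `beta_kl`, are assumptions made for this example. Each pathway gets its own small encoder/decoder subnetwork over its member genes and a multidimensional latent code, and the training objective balances module-specific local reconstruction terms against a global reconstruction term, mirroring the description in the abstract.

```python
# Minimal sketch of a pathway-modular VAE (illustrative only, not the pmVAE implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PathwayModule(nn.Module):
    """One VAE subnetwork restricted to the genes of a single pathway."""

    def __init__(self, n_pathway_genes, latent_dim=4, hidden_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_pathway_genes, hidden_dim), nn.ELU())
        self.mu = nn.Linear(hidden_dim, latent_dim)       # multidimensional pathway code
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, n_pathway_genes),
        )

    def forward(self, x_pathway):
        h = self.encoder(x_pathway)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar


class PathwayModularVAE(nn.Module):
    """Factorizes the latent space over pathway gene sets via boolean membership masks."""

    def __init__(self, pathway_masks, latent_dim=4):
        # pathway_masks: (n_pathways, n_genes) boolean gene-membership matrix
        super().__init__()
        self.register_buffer("masks", pathway_masks)
        self.pathway_modules = nn.ModuleList(
            [PathwayModule(int(m.sum()), latent_dim) for m in pathway_masks]
        )

    def forward(self, x):
        recon = torch.zeros_like(x)
        local_losses, kl = [], 0.0
        for mask, module in zip(self.masks, self.pathway_modules):
            x_p = x[:, mask]                              # expression of this pathway's genes
            recon_p, mu, logvar = module(x_p)
            recon[:, mask] += recon_p                     # modules jointly reconstruct shared genes
            local_losses.append(F.mse_loss(recon_p, x_p))  # module-specific local loss
            kl = kl + (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)).mean()
        global_loss = F.mse_loss(recon, x)                # global reconstruction loss
        return global_loss, torch.stack(local_losses).mean(), kl


# Toy usage: balance the global reconstruction term against the per-module local terms.
masks = torch.rand(5, 100) < 0.2                          # 5 toy pathways over 100 genes
model = PathwayModularVAE(masks)
x = torch.randn(8, 100)                                   # batch of 8 cells (toy data)
global_loss, local_loss, kl = model(x)
beta_local, beta_kl = 1.0, 1e-2                           # assumed weighting hyperparameters
loss = global_loss + beta_local * local_loss + beta_kl * kl
loss.backward()
```

Because each latent code is tied to a single pathway, a downstream comparison (e.g. stimulated vs. control cells) can be read out per module, which is what makes the representation interpretable at the pathway level.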