2019
DOI: 10.1101/631382
Preprint

SISUA: Semi-Supervised Generative Autoencoder for Single Cell Data

Abstract: Single-cell transcriptomics offers a tool to study the diversity of cell phenotypes through snapshots of the abundance of mRNA in individual cells. Often there is additional information available besides the single cell gene expression counts, such as bulk transcriptome data from the same tissue, or quantification of surface protein levels from the same cells. In this study, we propose models based on the Bayesian generative approach, where protein quantification available as CITE-seq counts from the same cell…

Cited by 4 publications (5 citation statements)
References 25 publications
“…Sample RNA-sequencing data was kindly provided by Dr Ville Hautamäki, an author of the paper (Trung Ngo Trong et al, 2020), in which a Bayesian inference method was also used to obtain the latent values. However, since that paper does not take cells coming from a common group into account, it uses a different distribution (Kingma and Welling, 2014) to obtain the posterior probability.…”
Section: Methods (mentioning)
confidence: 99%
“…However, in that paper, since cells coming from a common group are not taken into account, they use a different distribution (Kingma and Welling, 2014) to obtain the posterior probability. Our Bayesian inference method for obtaining the posterior probability of the gene-expression values that need imputation relies heavily on the batch or group effect, so the formula is designed accordingly, unlike Trung Ngo Trong et al, 2020, which did not take the batch effect into account. The sample data is also provided at the GitHub link specified for downloading the Python script.…”
Section: Methods (mentioning)
confidence: 99%
“…Previous work derives a linear cutoff on the number of counts for each protein based on spiked-in cells that do not express the proteins specifically recognized by the barcoded antibodies [3]. Others fit mixture models to each protein, which assumes that all cells are subject to the same background distribution [11,12]. Our approach, which obviates the need for negative control cells or the assumption of a constant background distribution, models each y_nt | z_n, μ_nt as a negative binomial mixture, where the Bernoulli parameter π_nt | z_n can be interpreted as the probability that a cell's protein count came from the background.…”
Section: Protein Background Disentanglement (mentioning)
confidence: 99%
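The Bernoulli-weighted negative binomial mixture described in that excerpt can be sketched as a simple posterior computation. This is not the cited paper's full model (there the mixture parameters are learned per cell from the latent variable z_n); here the two negative binomial components and the prior weight are fixed, illustrative inputs, and all parameter names are assumptions for the sketch.

```python
import numpy as np
from scipy.stats import nbinom

def background_posterior(counts, pi_bg, r_bg, p_bg, r_fg, p_fg):
    """Posterior probability that each protein count came from the
    background component of a two-component negative binomial mixture.

    pi_bg        : prior (Bernoulli) probability of the background component
    (r_bg, p_bg) : scipy nbinom parameters of the background component
    (r_fg, p_fg) : scipy nbinom parameters of the foreground component
    All names are illustrative, not taken from the cited paper.
    """
    counts = np.asarray(counts)
    lik_bg = nbinom.pmf(counts, r_bg, p_bg)   # likelihood under background
    lik_fg = nbinom.pmf(counts, r_fg, p_fg)   # likelihood under foreground
    num = pi_bg * lik_bg
    return num / (num + (1.0 - pi_bg) * lik_fg)

# Low counts should be attributed to background (mean ~2),
# high counts to foreground (mean ~61).
post = background_posterior([1, 2, 50, 80], pi_bg=0.5,
                            r_bg=2, p_bg=0.5,
                            r_fg=10, p_fg=0.14)
```

With well-separated components the posterior is close to 1 for counts near the background mean and close to 0 for counts near the foreground mean.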
“…As a result, the distribution of protein counts across cells is often bimodal with a background and foreground component. A natural way to preprocess the protein counts is to fit a mixture model to each protein globally and replace a count with its probability of being generated from the larger component [11,12]. However, there is no basis for assuming global bimodality of protein counts since a dataset typically comprises heterogeneous populations of cells each with distinct surface proteomes.…”
(mentioning)
confidence: 99%
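The global preprocessing recipe criticized in that excerpt — fit a two-component mixture to each protein and replace each count with its probability under the larger component — can be sketched with a small EM loop on log-transformed counts. A Gaussian mixture on log1p counts is a simplifying assumption for the sketch; the function name and initialization are illustrative.

```python
import numpy as np

def gmm2_foreground_prob(counts, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture to log1p-transformed
    protein counts with plain EM, then return each cell's posterior
    probability under the larger-mean ("foreground") component.
    A minimal sketch; real pipelines would add convergence checks.
    """
    x = np.log1p(np.asarray(counts, dtype=float))
    # Deterministic initialization: lower and upper quartiles as means.
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    var = np.array([x.var() + 1e-6] * 2)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities under each Gaussian component.
        log_pdf = (-0.5 * (x[:, None] - mu) ** 2 / var
                   - 0.5 * np.log(2 * np.pi * var) + np.log(w))
        log_pdf -= log_pdf.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_pdf)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances.
        nk = resp.sum(axis=0)
        w = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    fg = int(np.argmax(mu))  # component with the larger mean = foreground
    return resp[:, fg]

# Clearly bimodal toy counts: five background cells, five foreground cells.
probs = gmm2_foreground_prob([1, 2, 1, 3, 2, 100, 120, 90, 110, 105])
```

As the excerpt notes, this global fit implicitly assumes one shared background and one shared foreground mode across all cells, which heterogeneous populations violate.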
“…In the first phase, we train two deep variational autoencoder (VAE) neural networks that learn, in an unsupervised fashion, to reduce each given type of data to a latent representation (the encoder) and then expand that representation to recover the original data (the decoder). (V)AEs have already been successfully applied to scRNA-seq and scATAC-seq data, primarily for the purpose of de-noising [11-17]. Polarbear trains one VAE for each type of data, while taking into consideration sequencing depth and batch factors [15,16].…”
Section: Introduction (mentioning)
confidence: 99%