2019
DOI: 10.1186/s13059-019-1854-5

Assessment of computational methods for the analysis of single-cell ATAC-seq data

Abstract: Background: Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), lead to inherent data sparsity (1-10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10-45% of expressed genes detected per cell). Such …

Citations: cited by 282 publications (324 citation statements)
References: 36 publications
“…Despite the analyses listed for bulk ATAC-seq, another important analysis for single-cell is clustering. A recent benchmarking study from Chen et al about clustering methods in scATAC-seq showed that SnapATAC, Cusanovich2018 and cisTopic outperformed other methods [23, 173-175]. These three methods are featured by workflows combining window-based genome binning, binarization of the accessibility, coverage bias correction, and dimension reduction using principal component analysis, which specifically handle the sparse scATAC-seq data [175].…”
Section: Single-cell ATAC-seq
Citation type: mentioning
confidence: 99%
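The first two workflow steps named in the statement above (window-based genome binning and binarization of accessibility) can be sketched in a few lines. This is a minimal illustration, not the implementation used by SnapATAC, Cusanovich2018, or cisTopic: the fragment tuple format, the 5 kb `bin_size`, and the function names are all assumptions made for the example, and the coverage bias correction and PCA steps are left out.

```python
from collections import defaultdict


def binarized_bin_matrix(fragments, bin_size=5000):
    """Map fragments to fixed-width genome windows, one set of bins per cell.

    fragments: iterable of (cell_id, chrom, start, end) tuples with an
    exclusive end coordinate (hypothetical input format).
    Returns {cell_id: {(chrom, bin_index), ...}}. Storing bins in a set
    is the binarization: a bin is either accessible or not, regardless
    of how many fragments fall in it.
    """
    cells = defaultdict(set)
    for cell_id, chrom, start, end in fragments:
        first = start // bin_size
        last = (end - 1) // bin_size
        # A fragment marks every window it overlaps.
        for bin_index in range(first, last + 1):
            cells[cell_id].add((chrom, bin_index))
    return dict(cells)


def coverage_per_cell(cells):
    """Number of accessible bins per cell, the quantity a coverage
    bias correction step would need to account for."""
    return {cell_id: len(bins) for cell_id, bins in cells.items()}


fragments = [
    ("cellA", "chr1", 100, 300),
    ("cellA", "chr1", 150, 400),    # same bin as above, counted once
    ("cellA", "chr1", 4900, 5100),  # spans bins 0 and 1
    ("cellB", "chr2", 12000, 12200),
]
matrix = binarized_bin_matrix(fragments)
coverage = coverage_per_cell(matrix)
```

A sparse set-per-cell layout is used here because only a small fraction of bins is accessible in any one cell; a real pipeline would typically convert this to a sparse matrix before dimension reduction.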
“…type identification as a downstream task of DR. Both measures are frequently used to assess the performance of cell embedding techniques [18,11,20,23,10]. Another standard measure to assess the ability of a clustering algorithm to recover known classes is the adjusted Rand index (ARI) [11,20,23,10]; however, we found that AMI and ARI are extremely correlated (Additional file 1: Fig.…”
Section: Plates Based
Citation type: mentioning
confidence: 93%
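The adjusted Rand index mentioned in the statement above can be computed directly from the contingency table of two labelings. The sketch below uses the standard pair-counting formula with only the Python standard library; the function name and the handling of degenerate inputs are choices made for this example, not taken from the cited papers.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treated as perfect agreement

    # Pairs of items grouped together in both, in the first, in the second.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = in_true * in_pred / total_pairs  # agreement expected by chance
    maximum = (in_true + in_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


# Label names do not matter, only the grouping.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index is corrected for chance, it can be negative when two labelings agree less than random assignments would, as in the second call.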
“…For Louvain, the nearest neighbors graph construction was done with scanpy (version 1.3.8) using default parameters, and the clustering was also run with default parameters using the 'taynaud' flavor. Note that since we had to automatically run Louvain on all embeddings, as done in [23], we could not properly tune the resolution nor the size of the neighborhood and thus Louvain could either overcluster or undercluster.…”
Section: Clustering Algorithms
Citation type: mentioning
confidence: 99%
“…We generated simulated data with different levels of noise from the bulk ATAC-seq data of 13 primary human blood cell types [8] using the same strategy as that in [47]. We started with the bulk peak-by-cell count matrix and generated count for peak i in cell type t using a binomial distribution $\mathrm{binom}(2, p_i^t)$, where $p_i^t = (1-q)\,r_i^t/2 + qn/2k$, $r_i^t$ is the percentage of all reads overlapping with peak i in cell type t, k is the total number of peaks in the bulk data, n is the number of simulated fragments, and q is a parameter specifying the level of noise; q = 0 indicates no noise while q = 1 indicates the highest level of noise.…”
Section: SCRAT
Citation type: mentioning
confidence: 99%