2018
DOI: 10.12688/f1000research.15666.2

A systematic performance evaluation of clustering methods for single-cell RNA-seq data

Abstract: Subpopulation identification, usually via some form of unsupervised clustering, is a fundamental step in the analysis of many single-cell RNA-seq data sets. This has motivated the development and application of a broad range of clustering methods, based on various underlying algorithms. Here, we provide a systematic and extensible performance evaluation of 14 clustering algorithms implemented in R, including both methods developed explicitly for scRNA-seq data and more general-purpose methods. The methods were…

Citations: cited by 255 publications (168 citation statements)
References: 54 publications
“…This method is the default clustering method implemented in the Scanpy and Seurat single-cell analysis platforms. It has been shown to outperform other clustering methods for single-cell RNAseq data (Duò et al, 2018; Freytag et al, 2018), and flow and mass cytometry data (Weber & Robinson, 2016). Conceptually, the Louvain algorithm detects communities as groups of cells that have more links between them than expected from the number of links the cells have in total.…”
Section: Cluster Analysis Clustering | Citation type: mentioning
confidence: 99%
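
The quantity behind "more links between them than expected" is usually formalized as graph modularity, which Louvain-style methods try to maximize. The following is a minimal sketch of that score only, not of the Louvain optimization itself and not of the Scanpy or Seurat implementations. The toy graph, the labels, and the function name `modularity` are illustrative assumptions and do not come from the cited papers.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in community c
    degree = defaultdict(int)  # summed node degree of community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Observed fraction of within-community edges minus the fraction
    # expected if edges were placed at random with the same degrees.
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy graph: two triangles of "cells" joined by one edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
TWO_GROUPS = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE_GROUP = {node: "a" for node in range(6)}

print(modularity(EDGES, TWO_GROUPS))  # positive: denser than expected inside groups
print(modularity(EDGES, ONE_GROUP))   # zero: a single group carries no structure
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the sense in which a community has more internal links than expected.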
“…The worst rank of 5.5 came from the Biase dataset, where only one cell was misplaced by SAME, leading to a high ARI but didn't perform well rank-wise because there were 3 methods that achieved perfect clustering when compared to the "gold-standard". We also compare our results to our previously published SAFE method (12), which performed overall second best and remains an attractive alternative (33), particularly when analyzing large datasets to save computation time.…”
Section: Results | Citation type: mentioning
confidence: 92%
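
For readers unfamiliar with the ARI (adjusted Rand index) mentioned in the statement above, here is a minimal sketch of how it can be computed from two label vectors using the standard pair-counting formula. The labels are hypothetical toy data, not the Biase dataset, and the function name `adjusted_rand_index` is an illustrative choice.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same cells.

    Assumes both sequences have equal length of at least 2.
    """
    n = len(truth)
    # Pairs of cells grouped together in both labelings.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    # Pairs grouped together in each labeling separately.
    pairs_truth = sum(comb(c, 2) for c in Counter(truth).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(pred).values())
    expected = pairs_truth * pairs_pred / comb(n, 2)  # chance agreement
    maximum = (pairs_truth + pairs_pred) / 2
    if maximum == expected:
        # Degenerate case, e.g. both labelings put every cell in one cluster.
        return 1.0
    return (pairs_both - expected) / (maximum - expected)

# Hypothetical example: ten cells, two true groups, one cell misplaced.
TRUTH = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
ONE_MISPLACED = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

print(adjusted_rand_index(TRUTH, TRUTH))          # perfect agreement
print(adjusted_rand_index(TRUTH, ONE_MISPLACED))  # below 1 after a single error
```

Because a single misplaced cell already moves the score below 1, a method can post a high ARI and still rank behind methods that recover the reference labels exactly.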
“…In their systematic evaluation of clustering methods, Duò et al included a method in which cells are clustered by applying the k-means algorithm to t-SNE results. However, kmeans is not a suitable choice for clustering data after applying a non-linear dimensionality reduction method such as t-SNE that only preserves local, but not global distances [5]. It is sometimes believed that t-SNE results have merely "illustrative" character and are not sufficiently reliable to justify their use in clustering.…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…The unique characteristics of single-cell RNA-Seq data, in particular the high intrinsic levels of technical noise [2][3][4], have motivated the development of specialized scRNA-Seq clustering methods. A wide array of methods has been proposed, and a recent study has systematically compared twelve clustering methods on real and simulated data [5].…”
Section: Introduction | Citation type: mentioning
confidence: 99%