2020
DOI: 10.1371/journal.pcbi.1007794

Scedar: A scalable Python package for single-cell RNA-seq exploratory data analysis

Abstract: 1 Supplementary methods. 1.1 scedar package development. Scedar is built upon various high-performance scientific computing and visualization packages. Scedar is also extensively benchmarked and tested by unit testing, with comprehensive coverage on statements and branches. Scedar uses the following packages:
• numpy (Travis E. Oliphant, 2006) for matrix representation and operations.
• scipy (Virtanen et al., 2018) for fast Gaussian kernel density estimation, hierarchical clustering and sparse matrix.
* yuanchao.z…

citations
Cited by 12 publications
(5 citation statements)
references
References 72 publications
“…A key analytical need for the PCA, analyses of longitudinal datasets, would be driven by PCA researcher needs. Computational methods with the ability to cluster and visualize cellular heterogeneity across millions of cells have been recently introduced (Cho et al, 2018, Wolf et al, 2018, Zhang and Taylor, 2018), which could be adapted for longitudinal data, for instance using machine learning approaches (Hu and Greene, 2018, van Dijk et al, 2018, Lin et al, 2017, Amodio et al, 2017, Schiebinger et al, 2017, Schiebinger et al, 2019) to help map cell lineages and trajectories particular to pediatric development.…”
Section: Expected Challenges and Timelines
citation type: mentioning
confidence: 99%
“…An extensive empirical study has been carried out by comparing the performance of contrastive-sc with 11 alternative techniques, representing both methods requiring or not the number of clusters as input. ScziDesk [19], scDeepClustering [16], scRNA [11], cidr [8] and soup [12] take as input the expected number of clusters while Seurat [13] (scanpy [14] implementation), desc [18], scedar [32], raceid [33] and scvi [20] perform clustering without any alternative information. Additionally, a naive baseline method consisting of clustering with KMeans the first 2 principal components of the expression matrix has been assessed.…”
Section: Competing Methods
citation type: mentioning
confidence: 99%
“…Our experimental setup compared the performance of graph-sc with 12 competing methods, representative of both scenarios. ScziDesk (Chen et al, 2020), scDeepClustering (Tian et al, 2019), scRNA (Mieth et al, 2019), cidr (Lin et al, 2017) and soup (Zhu et al, 2019) take as input the expected number of clusters while scGNN (Wang et al, 2021), Seurat (Satija et al, 2015) (scanpy (Wolf et al, 2018) implementation), desc (Li et al, 2020), scedar (Zhang et al, 2020), raceid (Muraro et al, 2016) and scvi (Lopez et al, 2018) perform clustering without any alternative information. In addition, 6 naive baselines (depicted in gray in all our plots) consisting of clustering with K-means the following dimensionality reduced version of the expression matrix were assessed: the first 2 (labeled pca2_kmeans) and 50 (labelled pca50_kmeans) principal components of X, the first 20 (umap20_kmeans) or 50 (umap50_kmeans) UMAP, the first 2 UMAP components of the 50 PCA (pca50_umap_kmeans) of X and with Leiden the best performing baseline, the 2 UMAP components of the 50 PCA of X (labelled pca50_umap_leiden).…”
Section: Competing Methods
citation type: mentioning
confidence: 99%
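
The naive baseline described in the citation statements above (K-means on the first 2 principal components of the expression matrix) can be sketched roughly as follows. This is only an illustration written with NumPy, not the citing authors' code: the function name `pca_kmeans_baseline`, its parameters, the random initialization, and the plain Lloyd iteration are all assumptions made for this example, and the quoted text does not state which preprocessing or K-means settings were actually used.

```python
import numpy as np


def pca_kmeans_baseline(X, n_components=2, n_clusters=2, n_iter=100, seed=0):
    """Hypothetical helper: k-means on the leading principal components of X.

    X is a (cells x genes) expression matrix. Returns the cluster labels and
    the low-dimensional embedding.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # PCA via SVD of the column-centered matrix; rows of Z are the cells
    # projected onto the first n_components principal axes.
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :n_components] * S[:n_components]

    # Plain Lloyd's k-means, with centers initialized from randomly chosen
    # cells (an assumption; k-means++ is the more common initialization).
    centers = Z[rng.choice(len(Z), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every cell to every center.
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            Z[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, Z
```

Calling `pca_kmeans_baseline(X, n_components=2, n_clusters=k)` on a cells-by-genes matrix returns one integer label per cell. Note that this sketch needs the number of clusters as input, as K-means does in general.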