jordan.squair@epfl.ch

a, Schematic overview of Augur. b, AUCs of Augur and a naive random forest classifier without subsampling in simulated scRNA-seq datasets containing increasing numbers of cells. Cell type prioritizations are confounded by training dataset size for the naive classifier, but Augur abolishes this confounding factor. The mean and standard deviation of ten simulation replicates are shown. c, Pearson correlations between the AUC of each cell type and the number of cells of that type sequenced, across a compendium of 22 scRNA-seq datasets, for Augur and a naive random forest classifier without subsampling. d, Augur AUCs scale monotonically with both the proportion of DE genes and the magnitude of DE in simulated cell populations. e, Relationship between the number of DE genes detected by a representative test for single-cell differential gene expression (Wilcoxon rank-sum test) and the proportion of differentially expressed genes simulated between the two populations, for simulated populations of between 100 and 1,000 cells. f, Augur cell type prioritizations track with duration of LPS exposure in a cross-species scRNA-seq experiment⁷. Grey points show AUCs with sample labels randomly permuted. g-h, Cell type prioritization in matched single-cell⁴ and bulk⁸ transcriptomic profiles of PBMCs after interferon stimulation. g, Left, Augur cell type prioritizations mirror the number of DE genes in a microarray dataset of FACS-purified cells. Right, the number of DE genes detected in the scRNA-seq dataset by a Wilcoxon rank-sum test is uncorrelated with the FACS gold standard. h, Correlation coefficients between cell type prioritizations (AUC or number of DE genes) in the scRNA-seq dataset and the FACS gold standard. i-j, Cell type prioritization in the mouse ventromedial hypothalamus reflects induction of IEG transcription.
i, Correlation between AUC and the difference in the first principal component of IEG expression (∆IEG eigengene) in mice engaging in aggressive behavior. j, Pearson correlation coefficients between cell type-specific AUC and ∆IEG eigengene values for eleven behavioral stimuli⁹. k, Reproducibility of cell type prioritization in two independent scRNA-seq studies of Alzheimer's disease⁵,¹⁰. l, Augur cell type prioritizations in a scATAC-seq dataset¹¹ track with the number of DE genes in an RNA-seq dataset of FACS-purified cells.
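
The two summary statistics that recur in this legend, the AUC and the Pearson correlation, can be illustrated with a minimal standard-library sketch. This is not Augur's implementation: the function names `auc`, `pearson`, and `subsample` are hypothetical, the inputs are toy values chosen only to exercise the functions, and the fixed-size subsampling step is an assumption about how a classifier could be made insensitive to the number of cells per type (panels b and c).

```python
import random
from statistics import mean


def auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive cell scores higher than a randomly chosen negative cell
    (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def subsample(cells, n, rng):
    """Draw at most n cells without replacement, so that every cell type
    contributes the same number of cells to training (assumed scheme)."""
    return rng.sample(cells, min(n, len(cells)))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy classifier scores for cells from two conditions (illustrative only).
    print(auc([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))
    # Toy per-cell-type AUCs against the number of cells of each type.
    print(pearson([0.55, 0.60, 0.70], [100, 400, 900]))
    # Every cell type is reduced to the same size before training.
    print(len(subsample(list(range(1000)), 20, rng)))
```

In a sketch like this, a correlation near zero between per-cell-type AUC and cell count would indicate that the prioritization is not simply tracking how many cells of each type were sequenced.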