Single-cell RNA-seq (scRNAseq) technologies are rapidly evolving, and a growing number of datasets are now available. Although very informative, standard scRNAseq experiments lose the spatial organization of the cells in the organism or tissue of origin. Conversely, spatial RNA-seq technologies designed to preserve the localization of the cells have limited throughput and gene coverage. Mapping scRNAseq data to genes with spatial information can thus increase coverage while providing spatial location. However, methods to perform such a mapping are still in their infancy and have not been benchmarked in an unbiased manner. To bridge this gap, we organized the DREAM Single-Cell Transcriptomics challenge to evaluate methods for reconstructing the spatial arrangement of single cells from single-cell RNA sequencing data. The challenge focused on the spatial reconstruction of cells from the Drosophila embryo from single-cell transcriptomic data, leveraging as gold standard in situ hybridization data for a set of selected driver genes from the Berkeley Drosophila Transcription Network Project reference atlas. The 34 participating teams used an array of different algorithms for gene selection and location prediction. We devised a novel scoring and cross-validation scheme to evaluate the robustness of the best-performing algorithms. Participants were able to correctly and robustly localize rare subpopulations of cells, accurately mapping both spatially co-localized and scattered groups of cells. The selection of predictor genes was essential for accurately locating the cells in the embryo. Among the most frequently selected genes, we observed relatively high expression entropy, high spatial clustering, and the presence of prominent developmental genes such as gap and pair-rule genes and tissue-defining markers.
Introduction

Recent technological advances in single-cell sequencing have revolutionized the biological sciences. In particular, single-cell RNA sequencing (scRNAseq) methods allow transcriptome profiling in a highly parallel manner, resulting in the quantification of thousands of genes across thousands of cells of the same tissue. However, with a few exceptions [1, 2, 3, 4, 5], current high-throughput scRNAseq methods share the drawback of losing the information about the spatial arrangement of the cells in the tissue during the cell dissociation step.

One way of regaining spatial information computationally is to appropriately combine the single-cell RNA dataset at hand with a reference database, or atlas, containing spatial expression patterns for several genes across the tissue. This approach was pursued in a few studies [6, 7, 8, 9, 10]. Achim et al. identified the location of 139 cells using 72 reference genes with spatial information from whole mount in situ hybridization (WMISH) of a marine annelid, and Satija et al. developed the Seurat algorithm to predict the position of 851 zebrafish cells based on their scRNAseq data and spatial information from in s...