Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.

Single-cell measurements of gene expression, using imaging techniques such as RNA-FISH (fluorescence in situ hybridization), have provided important insights into the kinetics of transcription and cell-to-cell variation in gene expression [1-3]. However, such approaches can examine the expression of only a small number of genes in each experiment, thus restricting our ability to examine co-expression patterns and to robustly identify subpopulations of cells. Protocols have been developed to overcome these limitations by amplifying small quantities of mRNA [4,5], which, in combination with microfluidics approaches for isolating individual cells [6,7], have been used to analyze the co-expression of tens to hundreds of genes in single cells [8,9]. These protocols also allow the entire transcriptome of large numbers of single cells to be assayed in an unbiased way. This was initially done using microarrays [10,11] but is more often now done using next-generation sequencing [12-15].
Such approaches have been used to model early embryogenesis in the mouse [16] and to investigate bimodality in gene expression patterns of differentiating immune cell types [17].

After the generation of single-cell RNA-sequencing (RNA-seq) profiles from hundreds of cells, one goal is to identify subpopulations that share a common gene-expression profile. Some of these subpopulations may represent previously unidentified cell types. Additionally, by studying patterns of gene expression in different single cells, insights into the regulatory landscape of each cell population can be obtained.

However, methods for identifying subpopulations of cells and modeling their gene regulatory landscapes are only now beginning to emerge [18,19]. To fully exploit single-cell RNA-seq data, we have to account for the random noise inherent to such data sets [20] and, equally important, for different hidden factors that might result in gene expression heterogeneity. Although the importance of accounting for unobserved factors is well established in bulk RNA-seq studies [21-23], robust approaches to detect and account for confounding f...