Experimental single-cell approaches are becoming widely used for many purposes, including investigation of the dynamic behaviour of developing biological systems. Consequently, a large number of computational methods for extracting dynamic information from such data have been developed. One example is RNA velocity analysis, in which spliced and unspliced RNA abundances are jointly modeled in order to infer a 'direction of change' and thereby a future state for each cell in the gene expression space. Naturally, the accuracy and interpretability of the inferred RNA velocities depend crucially on the correctness of the estimated abundances. Here, we systematically compare four widely used quantification tools, in total yielding twelve different quantification approaches, in terms of their estimates of spliced and unspliced RNA abundances in four experimental droplet scRNA-seq data sets. We
show that there are substantial differences between the quantifications obtained from different tools, and identify typical genes for which such discrepancies are observed. We further show that these abundance differences propagate to the downstream analysis, and can have a large effect on estimated velocities as well as the biological interpretation.
Our results highlight that abundance quantification is a crucial aspect of the RNA velocity analysis workflow, and that both the definition of the genomic features of interest and the quantification algorithm itself require careful consideration.

Single-cell RNA-seq (scRNA-seq) enables high-throughput profiling of gene expression on a transcriptome-wide scale in individual cells. The increased resolution compared to bulk RNA-seq, where only average expression profiles of populations of cells are obtained, provides vastly improved potential to study a variety of biological questions. One such question concerns the dynamics of biological systems, reflected in, e.g., cellular differentiation and development. While such dynamical processes would ideally be studied via repeated transcriptome-wide expression profiling of the same cells over time, this is not possible with current scRNA-seq protocols. Existing analysis methods for so-called trajectory inference are instead applied to one or several snapshots of a population of cells, assumed to comprise all stages of the trajectory of interest. Many computational methods for trajectory inference from scRNA-seq have been presented in the literature (reviewed by Saelens et al. (2019)). These methods typically use the similarity of the gene expression profiles between cells to construct a (possibly branching) path through the observed set of cells, representing the trajectory of interest. Projecting the cells onto this path then provides an ordering of the cells by so-called pseudotime.

A different approach to the investigation of developmental processes in scRNA-seq data instead exploits the underlying molecular dynamics. The feasibility of ...