The massive growth of single-cell RNA-sequencing (scRNAseq) and of the methods for its analysis still lacks sufficient and up-to-date benchmarks that could guide analytical choices. Moreover, current studies are often focused on isolated steps of the process. Here, we present a flexible R framework for pipeline comparison with multi-level evaluation metrics and apply it to the benchmarking of scRNAseq analysis pipelines using datasets with known cell identities. We evaluate common steps of such analyses, including filtering, doublet detection, normalization, feature selection, denoising, dimensionality reduction and clustering. On the basis of these analyses, we make a number of concrete recommendations about analysis choices. The evaluation framework, pipeComp, has been implemented so as to easily integrate any other step or tool, allowing extensible benchmarks and easy application to other fields (https://github.com/plger/pipeComp).

Background

Single-cell RNA-sequencing (scRNAseq) and the set of attached analysis methods are evolving fast, with more than 560 software tools available to the community [1], roughly half of which are dedicated to tasks related to data processing such as clustering, ordering, dimension reduction or normalization. This increase in the number of available tools follows the development of new sequencing technologies and the growing number of reported cells, genes and cell populations [2]. As data processing is a key step in any scRNAseq analysis, affecting downstream results and their interpretation, it is critical to evaluate the available tools.
A number of good comparison and benchmark studies have already been performed on various steps related to scRNAseq processing and analysis and can guide the choice of methodology (e.g., […]). However, these recommendations need constant updating and often leave many details of an analysis open. Another shortcoming of current benchmarking studies is that they fail to capture the scRNAseq processing workflow in its entirety. Although previous benchmarks have already brought valuable recommendations for data processing, some focused on only one aspect of data processing (e.g., [14]), did not evaluate how the tool selection affects downstream analysis (e.g., [17]), or did not tackle all aspects of data processing, such as doublet identification or cell filtering (e.g., [18]). A thorough evaluation of the tools covering all major processing steps is, however, urgently needed, as previous benchmarking studies highlighted that the combination of tools used can have a drastic impact on downstream analyses, such as differential expression analysis and cell-type deconvolution [18,3]. It is therefore critical to evaluate not only the individual effect of a preprocessing method, but also its positive or negative interaction with all other parts of a workflow.
Here, we develop a flexible R framework for pipeline comparison and evaluate the various steps of analysis leading from an initial count matrix to a cluster assignment.
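To give a sense of how such a framework is used, the sketch below shows how a benchmark of alternative methods might be set up with pipeComp. It is a minimal sketch based on the package's documentation: the scrna_pipeline() and runPipeline() calls follow the interface described there, but the exact argument names, the method wrappers listed as alternatives and the dataset paths are illustrative assumptions to be checked against the installed version of the package.

```r
# Minimal sketch of a pipeComp benchmark (names to be verified against
# the installed package version; dataset paths are placeholders).
library(pipeComp)

# A PipelineDefinition bundles the ordered analysis steps together with
# the evaluation metrics computed after each step; scrna_pipeline() is
# assumed to return the scRNAseq pipeline used in this study.
pipDef <- scrna_pipeline()

# Alternative methods/parameters to combine at each step; the wrapper
# names below are illustrative and follow the package vignette.
alternatives <- list(
  doubletmethod = c("none", "scDblFinder"),
  filt          = c("filt.lenient", "filt.stringent"),
  norm          = c("norm.seurat", "norm.sctransform", "norm.scran"),
  sel           = "sel.vst",
  selnb         = 2000,
  dr            = "seurat.pca",
  clustmethod   = "clust.seurat",
  dims          = c(10, 15, 20, 30),
  resolution    = c(0.01, 0.1, 0.5, 0.8, 1)
)

# Benchmark datasets with known cell identities (placeholder file paths)
datasets <- c(mixology10x3cl = "mixology10x3cl.SCE.rds",
              Zhengmix4eq    = "Zhengmix4eq.SCE.rds")

# Run every combination of alternatives on every dataset and collect
# the multi-level evaluation metrics for downstream aggregation
res <- runPipeline(datasets, alternatives, pipDef, nthreads = 4)
```

Because every combination of alternatives is run on every dataset, the number of pipeline instances grows multiplicatively with the alternatives at each step; the framework's step-wise evaluation is what keeps such a combinatorial benchmark tractable.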