A diffusion MRI (dMRI) tractography processing pipeline should be: i) reproducible in immediate test-retest, ii) reproducible in time, iii) efficient and iv) easy to use. Two runs of the same processing pipeline with the same input data should give the same output today, tomorrow and in 2 years. However, processing dMRI data requires a large number of steps (20+) that, at this time, may not be reproducible between runs or over time. If parameters such as the number of threads or the random number generator seed are not carefully set in the brain extraction, registration and fiber tracking steps, the resulting tractography can be far from reproducible, which limits brain connectivity studies. Moreover, processing a large database can take several hours to days of computation, even more so if the steps run sequentially. To handle these issues, we present TractoFlow, a fully automated pipeline that processes datasets from the raw diffusion-weighted images (DWI) to tractography. It also outputs classical diffusion tensor imaging measures (fractional anisotropy (FA) and diffusivities) and several HARDI measures (Number of Fiber Orientations (NuFO), Apparent Fiber Density (AFD)). The pipeline requires a DWI and a T1-weighted image as NIfTI files, and b-values/b-vectors in FSL format. An optional reverse phase-encoded b=0 image can also be used. The pipeline is based on two technologies, Nextflow and Singularity, as well as on pre-processing and processing steps recommended by the dMRI community. In this work, the TractoFlow pipeline is evaluated on three databases and shown to be efficient and 98% to 100% reproducible, depending on parameter choices. For example, 105 subjects from the Human Connectome Project (HCP) were fully processed in 25 hours to produce, for each subject, a whole-brain tractogram with 4 million streamlines.
The contribution of this paper is to highlight the importance of a pipeline that is robust in terms of runtime and reproducibility over time. In the era of open data and open science, efficiency and reproducibility are critical in neuroimaging projects. Our TractoFlow pipeline is publicly available for academic research and is an important step forward for better structural brain connectivity mapping.

Diffusion magnetic resonance imaging (dMRI) is the main technique to non-invasively obtain information about the white matter. Diffusion MRI is currently at the core of structural connectivity or white matter brain mapping [Van Essen et al., 2012], using dMRI tractography [Descoteaux et al., 2009; Girard et al., 2014] to reconstruct and visualize the white matter architecture [Maier-Hein et al., 2017; Jeurissen et al., 2017]. However, performing dMRI tractography from raw dMRI data can involve 20 to 25 processing steps.
The data processing depends on many tools in different packages with many parameters. Setting up the environment and installing all dependencies to process the data can be long, tedious and difficult, even more so for a be...