Analysing models of biological networks typically relies on workflows in which different software tools with sensitive parameters are chained together, often with additional manual steps. Making such workflows accessible and reproducible is challenging, because publications often omit analysis details and because some of these tools are difficult to install or have a steep learning curve. The CoLoMoTo Interactive Notebook provides a unified environment to edit, execute, share, and reproduce analyses of qualitative models of biological networks. This framework combines several technologies to ensure repeatability and to flatten the learning curve that users face with them. The framework is distributed as a Docker image with the tools ready to run without any installation step besides Docker, and is available on Linux, macOS, and Microsoft Windows. The embedded computational workflows are edited with a Jupyter web interface, which allows textual annotations to be included alongside the explicit code to execute and the visualisation of the results. The resulting notebook files can then be shared and re-executed in the same environment. To date, the CoLoMoTo Interactive Notebook provides access to software tools including GINsim, BioLQM, Pint, MaBoSS, and Cell Collective for the modelling and analysis of Boolean and multi-valued networks. More tools will be included in the future. We developed a Python interface for each of these tools to offer a seamless integration in the Jupyter web interface and ease the chaining of complementary analyses.
Recently, the scientific community has become increasingly concerned about difficulties in reproducing published results. In the context of preclinical studies, difficulties in reproducing important findings have raised controversy (see e.g. [7,15,40,43], and [8] for a review on this topic). Although they do not invalidate the findings, these observations have shaken the community. In 2016, a Nature survey pointed to the multi-factorial origin of this "reproducibility crisis" [4]. Factors related to computational analyses were highlighted, in particular the unavailability of code and methods, along with the technical expertise required to reproduce the computations. The scientific community is progressively addressing this problem. Prestigious conferences (such as two major conferences from the database community, namely VLDB and SIGMOD) and journals such as PNAS, Biostatistics [38], Nature [41] and Science [54], to name only a few, now encourage or even require published results to be accompanied by all the information necessary to reproduce them. While the reproducibility challenges were first observed in domains where deluges of data were quickly becoming available (e.g., Next Generation Sequencing data analyses), the problem is now present in many (if not all) communities where computational analyses and simulations are performed. In particular, the Systems Biology community is facing a pro...