IMP: a pipeline for reproducible reference-independent integrated metagenomic and metatranscriptomic analyses

Narayanasamy, Shaman; Jarosz, Yohan; Muller, Emilie; Heintz‐Buschart, Anna; Herold, Malte; Kaysen, Anne; Laczny, Cédric Christian; Pinel, Nicolás; May, Patrick; Wilmes, Paul

doi:10.1186/s13059-016-1116-8

Cited by 125 publications (117 citation statements)

References 78 publications (186 reference statements)

“…On the other hand, ensuring reproducibility does not come for free: anecdotic evidence suggests that the time spent on a project may increase by 30-50% [1], and that to reproduce the analysis of a single computational biology paper can require up to 280 hours [57]. YAMP, along with other containerised workflows, such as the Integrated Meta-omic Pipeline (IMP) [58] and Bio-Docklets [59], represents a proof-of-concept showing a simple way to enable reproducible and collaborative research. We also advocate the sharing of such containerised workflows, which will benefit a wide group of researchers, regardless of their computational experience [11].…”

Section: Discussion (mentioning)

confidence: 99%

YAMP: a containerised workflow enabling reproducibility in metagenomics research

Visconti, Martin, Falchi (2017). Preprint.

Abstract. YAMP is a user-friendly workflow that enables the analysis of whole shotgun metagenomics data while using containerisation to ensure computational reproducibility and facilitate collaborative research. YAMP can be executed on any UNIX-like system, and offers seamless support for multiple job schedulers as well as for Amazon AWS cloud. Although YAMP has been developed to be ready-to-use by non-experts, bioinformaticians will appreciate its flexibility, modularisation, and simple customisation. The YAMP script, parameters, and documentation are available at https://github.com/alesssia/YAMP.

“…Whether the sole use of CLI is a strength or a weakness is a matter of taste, but adding a web interface would raise the height of the software dependency stack greatly, adding to system complexity, not only increasing the chance of bugs but also introducing security concerns that are inherent in web applications. SyQADA has made no direct attempt to address "cloud" computing in contrast to Galaxy, IMP, and Omics Pipe (Fisch et al, 2015; Goecks et al, 2010; Narayanasamy et al, 2016), among others. Those with institutional access to large-scale computing resources may not yet have encountered a need for this.…”

Section: Samqc-2 (mentioning)

confidence: 99%

“…There are several related systems for managing bioinformatics workflows; from an analyst's perspective, they fall into the following four categories based on human interface: Command‐line interface (CLI): Bpipe (Sadedin, Pope & Oshlack, 2012), NGSANE (Buske, French, Smith, Clark & Bauer, 2014), Omics Pipe (Fisch et al, 2015); Standalone application with graphic interface (app): NEAT (Schorderet, 2016), Chipster (Kallio et al, 2011), GenePattern (Kuehn, Liberzon, Reich & Mesirov, 2008); Web application (webapp): Galaxy (Goecks et al, 2010), IMP (Narayanasamy et al, 2016); Application program interface (API): GATK (McKenna et al, 2010), Queue (Shakir, 2011), Omics Pipe (Fisch et al, 2015), Ruffus (Goodstadt, 2010). SyQADA is a CLI system whose only dependencies are a Unix operating system and a standard installation of Python 3.5 (or higher). This simplicity by design is shared by Bpipe, NGSANE and to some extent by Omics Pipe.…”

Section: Introduction (mentioning)

confidence: 99%

“…Command-line interface (CLI) Bpipe (Sadedin, Pope & Oshlack, 2012), NGSANE (Buske, French, Smith, Clark & Bauer, 2014), Omics Pipe (Fisch et al, 2015); Standalone application with graphic interface (app) NEAT (Schorderet, 2016), Chipster (Kallio et al, 2011), GenePattern (Kuehn, Liberzon, Reich & Mesirov, 2008); Web application (webapp) Galaxy (Goecks et al, 2010), IMP (Narayanasamy et al, 2016); Application program interface (API) GATK (McKenna et al, 2010), Queue (Shakir, 2011), Omics Pipe (Fisch et al, 2015), Ruffus (Goodstadt, 2010).…”

Section: Introduction (mentioning)

confidence: 99%

System for Quality‐Assured Data Analysis: Flexible, reproducible scientific workflows

Fowler, Lucas, Scheet (2018). Genetic Epidemiology.

The reproducibility of scientific processes is one of the paramount problems of bioinformatics, an engineering problem that must be addressed to perform good research. The System for Quality-Assured Data Analysis (SyQADA), described here, seeks to address reproducibility by managing many of the details of procedural bookkeeping in bioinformatics in as simple and transparent a manner as possible. SyQADA has been used by persons with backgrounds ranging from expert programmer to Unix novice, to perform and repeat dozens of diverse bioinformatics workflows on tens of thousands of samples, consuming over 80 CPU-months of computing on over 300,000 individual tasks of scores of projects on laptops, computer servers, and computing clusters. SyQADA is especially well-suited for paired-sample analyses found in cancer tumor-normal studies. SyQADA executable source code, documentation, tutorial examples, and workflows used in our lab are available from http://scheet.org/software.html.

“…A number of assembly-based metagenome pipelines have been developed, each providing a subset of the required tools needed to carry out a complete analysis process from raw data to annotated genomes (14-17). For example, MOCAT (16) relies on gene catalogs to evaluate the functional potential of the metagenome as a whole, but without directly relating functions to individual microbes.…”

Section: Introduction (mentioning)

confidence: 99%

ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data

Kieser, Brown, Zdobnov, et al. (2019). Preprint.

Background: Metagenomics and metatranscriptomics studies provide valuable insight into the composition and function of microbial populations from diverse environments, however the data processing pipelines that rely on mapping reads to gene catalogs or genome databases for cultured strains yield results that underrepresent the genes and functional potential of uncultured microbes. Recent improvements in sequence assembly methods have eased the reliance on genome databases, thereby allowing the recovery of genomes from uncultured microbes. However, configuring these tools,
