2021
DOI: 10.1093/gigascience/giaa140

Streamlining data-intensive biology with workflow systems

Abstract: As the scale of biological data generation has increased, the bottleneck of research has shifted from data generation to analysis. Researchers commonly need to build computational workflows that include multiple analytic tools and require incremental development as experimental insights demand tool and parameter modifications. These workflows can produce hundreds to thousands of intermediate files and results that must be integrated for biological insight. Data-centric workflow systems that internally manage c…

Citations: cited by 40 publications (52 citation statements)
References: 100 publications
“…Indeed, there is currently a new revolution, as shown by the exponential growth of publications, regarding the possibility to sequence DNA and RNA from single cells, as well as single organelles (mitochondria and chloroplasts) [13]. Special emphasis is now focused on integrating different -omics technologies, such as genomics (usually, DNA), transcriptomics (RNA), proteomics (peptides, like proteins), epigenomics (epigenetic factors) and metabolomics (metabolites), that eventually influence phenotypes in health and disease [14][15][16][17]. Furthermore, a combination of multi-omics techniques, complemented with morphological and physiological ones, allows a holistic approach to deciphering biological systems [18,19].…”
Section: Applications Of Nucleic-acid Sequencing
Citation type: mentioning
confidence: 99%
“…The notebook generates Pandas DataFrames for the feature table, taxonomy, reference sequence properties, and metadata, and a Pandas Series for alpha diversity. Static plots are generated from some of these tables using Seaborn (Qalieh et al 2017).…”
Section: Jupyter Notebooks
Citation type: mentioning
confidence: 99%
“…A popular approach for standardizing amplicon data analysis is to develop analysis pipelines or workflows (Prodan et al 2020;Reiter et al 2021) to run on local or networked computing resources or in the cloud. Examples include Anacapa (Curd et al 2019), Banzai (https://github.com/jimmyodonnell/banzai), PEMA (Zafeiropoulos et al 2020), Ampliseq (Straub et al 2020), Cascabel (Asbun et al 2019), and dadasnake (Weißbecker, Schnabel and Heintz-Buschart 2020).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Lessons learned were summarized and developed into the following interconnected (see Fig 1) Ten Simple Rules within a "collaboratory cultures" framework. While many Ten Simple Rules have been written about general collaboration, data sciences collaboration, statisticians' collaborations, and leveraging big data [2][3][4][5][6][7], we emphasize the "nontechnical" criteria that are necessary to promote effective collaborations, accelerate discovery, facilitate new partnerships, and develop the role of individuals within transdisciplinary [8] research projects-projects that combine disciplines in a nontraditional way, resulting in the development of novel frameworks, concepts, and methodologies to address scientific problems.…”
Section: Introduction
Citation type: mentioning
confidence: 99%