René Rahn scite author profile

Experiments in the life sciences often involve tools from a variety of domains such as mass spectrometry, next generation sequencing, or image processing. Passing the data between those tools often involves complex scripts for controlling data flow, data transformation, and statistical analysis. Such scripts not only tend to be platform dependent, they also tend to grow as the experiment progresses and are seldom well documented, which hinders the reproducibility of the experiment. Workflow systems such as KNIME Analytics Platform aim to solve these problems by providing a platform for connecting tools graphically and guaranteeing the same results on different operating systems. As open source software, KNIME allows scientists and programmers to provide their own extensions to the scientific community. In this review paper we present selected extensions from the life sciences that simplify data exploration, analysis, and visualization and are interoperable due to KNIME's unified data model. Additionally, we name other workflow systems that are commonly used in the life sciences and highlight their similarities to and differences from KNIME.

The SeqAn C++ template library for efficient sequence analysis: A resource for programmers

Reinert, Dadi, Ehrhardt, et al. 2017. Journal of Biotechnology

Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading

Rahn, Budach, Costanza, et al. 2018

Journaled string tree—a scalable data structure for analyzing thousands of similar genomes on your laptop

Rahn, Weese, Reinert. 2014

Needle: a fast and space-efficient prefilter for estimating the quantification of very large collections of expression experiments

Darvish, Seiler, Mehringer, et al. 2022

Motivation: The ever-growing size of sequencing data is a major bottleneck in bioinformatics as the advances of hardware development cannot keep up with the data growth. Therefore, an enormous amount of data is collected but rarely ever reused, because it is nearly impossible to find meaningful experiments in the stream of raw data.

Results: As a solution, we propose Needle, a fast and space-efficient index which can be built for thousands of experiments in less than two hours and can estimate the quantification of a transcript in these experiments in seconds, thereby outperforming its competitors. The basic idea of the Needle index is to create multiple interleaved Bloom filters that each store a set of representative k-mers depending on their multiplicity in the raw data. This is then used to quantify the query.

Supplementary information: Supplementary data are available at Bioinformatics online.

Availability and implementation: https://github.com/seqan/needle
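The idea of storing k-mers in several Bloom filters according to their multiplicity can be illustrated with a toy sketch. This is not Needle's implementation: it handles a single experiment, uses plain (not interleaved) Bloom filters and all k-mers rather than a representative subset, and the names `BloomFilter`, `build_levels`, `estimate`, the thresholds, and the `min_fraction` cutoff are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative, not tuned)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_levels(reads, k, thresholds):
    """Build one Bloom filter per threshold (hypothetical scheme).

    A k-mer is added to the filter for threshold t if it occurs
    at least t times in the reads.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    levels = {t: BloomFilter() for t in thresholds}
    for km, count in counts.items():
        for t in thresholds:
            if count >= t:
                levels[t].add(km)
    return levels


def estimate(levels, transcript, k, min_fraction=0.5):
    """Return the highest threshold whose filter contains at least
    min_fraction of the transcript's k-mers, or 0 if none does."""
    query = kmers(transcript, k)
    for t in sorted(levels, reverse=True):
        hits = sum(km in levels[t] for km in query)
        if query and hits / len(query) >= min_fraction:
            return t
    return 0
```

Under these assumptions, a transcript whose k-mers are abundant in the reads is assigned a high threshold, a rarely seen transcript a low one, and an absent transcript 0 (up to Bloom filter false positives). The estimate is only as fine-grained as the chosen thresholds.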

KNIME for reproducible cross-domain analysis of life science data