2014
DOI: 10.1093/bioinformatics/btu558

BioBloom tools: fast, accurate and memory-efficient host species sequence screening using bloom filters

Abstract: Large datasets can be screened for sequences from a specific organism, quickly and with low memory requirements, by a data structure that supports time- and memory-efficient set membership queries. Bloom filters offer such queries but require that false positives be controlled. We present BioBloom Tools, a Bloom filter-based sequence-screening tool that is faster than BWA, Bowtie 2 (popular alignment algorithms) and FACS (a membership query algorithm). It delivers accuracies comparable with these tools, contro…
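
To make the idea in the abstract concrete, here is a minimal sketch of k-mer screening with a Bloom filter. It is not the BioBloom Tools implementation: the `BloomFilter` class, the salted BLAKE2b hashing, the helper names `kmers` and `fraction_matching`, the toy reference sequence, and the choice of k = 8 are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter over strings (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive several hash values by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def kmers(seq: str, k: int):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def fraction_matching(read: str, bf: BloomFilter, k: int) -> float:
    """Fraction of the read's k-mers that the filter reports as present."""
    kms = list(kmers(read, k))
    if not kms:
        return 0.0
    return sum(km in bf for km in kms) / len(kms)


# Hypothetical "host" reference and reads, for illustration only.
reference = "ACGTTGCATGCCGATAGCTAGGCTTAACGGATCCGTA"
bf = BloomFilter(num_bits=10_000, num_hashes=3)
for km in kmers(reference, 8):
    bf.add(km)

host_read = reference[5:25]
foreign_read = "TTTTTTTTTTGGGGGGGGGG"
host_score = fraction_matching(host_read, bf, 8)        # 1.0: no false negatives
foreign_score = fraction_matching(foreign_read, bf, 8)  # low unless false positives occur
```

A read would then be assigned to the host when its score passes some cutoff; the cutoff and the scoring rule used here are placeholders rather than the tool's actual classification rule.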

Cited by 107 publications (91 citation statements)
References 9 publications
“…In the first stage we classify read sequences using a pipeline based on BBT (Release 1.2.10), a fast Bloom filter-based method (Chu et al, 2014). For 48-bp PE RNAseq data, we processed data for 408 tumor samples and 19 tissue normal samples; for 76-bp PE WES data we processed 412 tumor samples and 429 blood or tissue normals; and for 51-bp PE WGS data (high and low pass) we processed 136 tumor samples and 145 blood or tissue normal samples.…”
Section: STAR Methods (mentioning)
confidence: 99%
“…The microbial detection pipeline is based on BioBloomTools (BBT, v1.2.4.b1), which is a Bloom filter-based method for rapidly classifying RNA-seq or DNA-seq read sequences (Chu et al, 2013). We generated 43 filters from ‘complete’ NCBI genome reference sequences of bacteria, viruses, fungi and protozoa, using 25-bp k-mers and a false positive rate of 0.02.…”
Section: Methods (mentioning)
confidence: 99%
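
The statement above quotes a target false positive rate of 0.02. As a rough guide to what that implies for filter size, the sketch below applies the textbook Bloom filter sizing formulas, m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. Whether BioBloom Tools sizes its filters this way is an assumption; the function names and the example count of one million k-mers are hypothetical.

```python
import math


def bloom_parameters(num_items: int, fpr: float) -> tuple[int, int]:
    """Return (num_bits, num_hashes) from the standard optimal-sizing formulas."""
    if num_items <= 0 or not 0.0 < fpr < 1.0:
        raise ValueError("num_items must be positive and fpr must lie in (0, 1)")
    num_bits = math.ceil(-num_items * math.log(fpr) / math.log(2) ** 2)
    num_hashes = max(1, round(num_bits / num_items * math.log(2)))
    return num_bits, num_hashes


def expected_fpr(num_items: int, num_bits: int, num_hashes: int) -> float:
    """Usual approximation of the false positive rate of a loaded filter."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Hypothetical example: one million distinct k-mers at a 0.02 target rate.
num_bits, num_hashes = bloom_parameters(1_000_000, 0.02)
bits_per_kmer = num_bits / 1_000_000
achieved = expected_fpr(1_000_000, num_bits, num_hashes)
```

Under these formulas a 0.02 target works out to roughly 8 bits per stored k-mer with 6 hash functions, regardless of k-mer length.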
“…We also looked for the presence of viral pathogens in PTC using two independent methods to assess the RNA-seq data: PathSeq (Kostic et al, 2011) and BioBloom Tools (Chu et al, 2014). We identified two tumors with hepatitis B virus (HBV) and one tumor with human papillomavirus 45 (HPV45) at relative frequencies exceeding 0.1 viral reads per million human reads (RPM) for PathSeq and 0.2 RPM for BBT (see Supplement and Tables S4G-I), indicating that viral pathogens are unlikely significant contributors to PTC pathogenesis.…”
Section: Results (mentioning)
confidence: 99%