Background
Next-generation sequencing (NGS) enables rapid production of billions of bases at
a relatively low cost. Mapping reads from next-generation sequencers to a given
reference genome is an important first step in many sequencing applications.
Popular read mappers, such as Bowtie and BWA, are optimized to return the top one or a
few candidate locations of each read. However, identifying all mapping locations
of each read, instead of just one or a few, is also important in some sequencing
applications such as ChIP-seq for discovering binding sites in repeat regions, and
RNA-seq for transcript abundance estimation.

Results
Here we present Hobbes2, a software package designed for fast and accurate
alignment of NGS reads and specialized in identifying all mapping locations of
each read. Hobbes2 efficiently identifies all mapping locations of reads using a
novel technique that utilizes additional prefix q-grams to improve
filtering. We extensively compare Hobbes2 with state-of-the-art read mappers, and
show that Hobbes2 can be an order of magnitude faster than other read mappers
while consuming less memory and achieving similar accuracy.

Conclusions
We propose Hobbes2, a read mapper specialized in identifying all mapping locations of each read, to improve the accuracy of read mapping. Hobbes2 is implemented in C++, and
the source code is freely available for download at
http://hobbes.ics.uci.edu.
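
To illustrate the general idea of q-gram filtering mentioned in the Results, here is a minimal Python sketch of a pigeonhole filter for all-locations mapping under Hamming distance (mismatches only). It is not the Hobbes2 algorithm or its code, which this abstract does not describe in detail. The function names `build_index`, `hamming`, and `map_read`, and the `extra` parameter, are hypothetical and introduced only for this example. The sketch relies on a standard argument: if a read is split into k + 1 + extra non-overlapping q-grams and matches a reference window with at most k mismatches, then at least extra + 1 of those q-grams must match exactly, so any candidate location supported by fewer q-grams can be skipped without verification.

```python
from collections import defaultdict


def build_index(reference, q):
    """Map every q-gram of the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - q + 1):
        index[reference[pos:pos + q]].append(pos)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, q, k, extra=0):
    """Return every start position where `read` matches with <= k mismatches.

    Assumption for this sketch: the first k + 1 + extra non-overlapping
    q-grams of the read are used as the filter. With at most k mismatches,
    at least extra + 1 of them must occur exactly at the right offset.
    """
    num_grams = k + 1 + extra
    if num_grams * q > len(read):
        raise ValueError("read too short for the requested q, k and extra")

    # Count, per candidate start position, how many q-grams support it.
    votes = defaultdict(int)
    for i in range(num_grams):
        offset = i * q
        for pos in index.get(read[offset:offset + q], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1

    # Verify only the candidates that pass the filter.
    hits = []
    for start, count in votes.items():
        if count >= extra + 1:
            window = reference[start:start + len(read)]
            if hamming(read, window) <= k:
                hits.append(start)
    return sorted(hits)
```

Calling `map_read(read, reference, build_index(reference, q), q, k)` gives the baseline filter. Setting `extra=1` requires two supporting q-grams per candidate, which typically sends fewer candidates to the verification step while still returning every valid location.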