2020
DOI: 10.1089/cmb.2019.0316

Matching Reads to Many Genomes with the r-Index

Abstract: The r-index is a tool for compressed indexing of genomic databases for exact pattern matching, which can be used to completely align reads that perfectly match some part of a genome in the database or to find seeds for reads that do not. This paper shows how to download and install the programs ri-buildfasta and ri-align; how to call ri-buildfasta on a FASTA file to build an r-index for that file; and how to query that index with ri-align. Availability: The source code for these programs is released under GPL…

Cited by 8 publications (9 citation statements)
References 15 publications

“…While this comparison between SPUMONI and minimap2 is close, we expect that as we are able to index and align to more human references simultaneously – for example, as more assemblies from the Human Pangenome Reference Consortium ( Human Pangenome Reference, 2021 ) and similar projects emerge — SPUMONI is well positioned for sublinear index growth and a greater throughput advantage. For instance, the r -index underlying SPUMONI was previously shown to be able to index up to 10 human genomes with sublinear growth in the index size ( Mun et al., 2020 ).…”
Section: Results (mentioning)
confidence: 99%
“…Importantly, the space required by an r -index is proportional to the number of runs in the Burrows-Wheeler transform (BWT) of the reference genomes (defined as r ) rather than the total length of the reference genomes. When the collection is highly repetitive, r grows sublinearly and far more slowly than the total length ( Mun et al., 2020 ).…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, in a typical metagenomics experiment, the exact strain or sub-strain of a microorganism is unknown prior to sequencing and therefore, for optimal targeted sequencing, all strains and substrains need to be incorporated into the reference for identification. SPUMONI takes advantage of the overall repetitiveness of these references by building an r -index [5], and using the MONI algorithm to calculate matching statistics (MSs) [6].…”
Section: Introduction (mentioning)
confidence: 99%
“…Importantly, the space required by an r -index is proportional to the number of runs in the Burrows-Wheeler Transform of the reference genomes (defined as r ) rather than the total length of the reference genomes. When the collection is highly repetitive, r grows sublinearly, and far more slowly than the total length [5].…”
Section: Introduction (mentioning)
confidence: 99%
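
The citation statements above say that the space of an r-index is proportional to r, the number of runs in the BWT of the reference genomes, and that r grows far more slowly than the total length when the collection is highly repetitive. The toy sketch below illustrates that relationship only. It is not the r-index or the ri-buildfasta implementation: the functions `bwt`, `count_runs`, and `mutate`, the 200-base random "genome", and the choice of 3 substitutions per copy are all assumptions made for illustration, and the BWT is built by naively sorting rotations.

```python
import random


def bwt(text: str) -> str:
    """Burrows-Wheeler transform by sorting all rotations (toy inputs only)."""
    s = text + "$"  # assumed terminator; "$" sorts before A, C, G, T
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal characters, i.e. r when s is a BWT."""
    if not s:
        return 0
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


def mutate(seq: str, n_subs: int, rng: random.Random) -> str:
    """Return a copy of seq with n_subs single-base substitutions (hypothetical helper)."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n_subs):
        chars[pos] = rng.choice([c for c in "ACGT" if c != chars[pos]])
    return "".join(chars)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
base = "".join(rng.choice("ACGT") for _ in range(200))

one = base                                                      # a single "genome"
many = base + "".join(mutate(base, 3, rng) for _ in range(7))   # 8 near-identical copies

r_one = count_runs(bwt(one))
r_many = count_runs(bwt(many))

print(f"1 copy:   length={len(one)}, r={r_one}")
print(f"8 copies: length={len(many)}, r={r_many}")
```

In this sketch the total length grows eightfold while r grows by a much smaller factor, which is the behaviour the statements describe for repetitive collections. Real genomes need a suffix-array-based construction rather than rotation sorting.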