2013
DOI: 10.1093/bioinformatics/btt215

Short read alignment with populations of genomes

Abstract: Summary: The increasing availability of high-throughput sequencing technologies has led to thousands of human genomes having been sequenced in the past years. Efforts such as the 1000 Genomes Project further add to the availability of human genome variation data. However, to date, there is no method that can map reads of a newly sequenced human genome to a large collection of genomes. Instead, methods rely on aligning reads to a single reference genome. This leads to inherent biases and lower accuracy. To tack…

Cited by 113 publications (105 citation statements)
References 33 publications
“…Nonredundant search of a population of human genomes was recently shown to improve detection sensitivity using a new data structure to map reads to multiple human genomes simultaneously (Huang et al 2013). This approach used assembled human genomes for reference but excluded microbial genomes.…”
mentioning
confidence: 99%
“…Similar idea was also adopted in [13] for larger indel calling, but replacing the k-mer indexing of [12] with BWT-index built for the extracted context around putative indels and their nearby combinations. Recently, this context extraction idea was also proposed for pan-genome indexing [14]. Our approach is more generic than the k-mer indexing and context extracting approaches above in that the read length is not fixed.…”
Section: Related Work
mentioning
confidence: 99%
“…We also compared GCSA to BWBBLE [14], a recent BWT-based read aligner for pan-genomes. Given an upper bound for read length, BWBBLE creates a new sequence for each known indel, with an amount of context before and after the indel depending on the upper bound.…”
Section: Pattern Matching
mentioning
confidence: 99%
“…A major drawback of the hash-based aligners is that they require prohibitive amount of memory (see Note 3). The second generation BWT-based aligners are preferred as they consume only a limited amount of memory [38,39].…”
Section: Read Alignment To a Reference Genome
mentioning
confidence: 99%