Fast and sensitive mapping of nanopore sequencing reads with GraphMap

Sović, Ivan; Šikić, Mile; Wilm, Andreas; Fenlon, Shannon N.; Chen, Swaine L.; Nagarajan, Niranjan

doi:10.1038/ncomms11307

Cited by 353 publications

(325 citation statements)

References 37 publications

Mentioning: 322

“…Genomic DNA of CEN.PK113-7D Delft and CEN.PK113-7D Frankfurt for WGS was isolated using the Qiagen 100/G kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions and quantified using a Qubit® Fluorometer 2.0 (ThermoFisher Scientific, Waltham, MA, USA). (Sović et al 2016) and calculating mismatches based on the CIGAR strings of reads with a mapping quality of at least 1 and no more than 500 nt of soft/hard clipping on each end of the alignment to avoid erroneous read alignments due to repetitive regions (i.e. paralogous genes, genes with copy number variation).…”

Section: Yeast Cultivation and Genomic DNA Extraction (mentioning)

confidence: 99%
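
The read filter described in the statement above (mapping quality of at least 1, no more than 500 nt of soft/hard clipping on each end) can be sketched in a few lines of Python. This is a minimal illustration, not the citing authors' code: the function names are hypothetical, and the mismatch count assumes alignments written with extended CIGAR operators (`=`/`X`), since a plain `M` operation does not distinguish matches from mismatches.

```python
import re

# One CIGAR operation: a length followed by an operator from the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a SAM CIGAR string into (length, operator) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def end_clipping(ops):
    """Return the soft/hard-clipped bases at the left and right alignment ends."""
    left = 0
    for n, op in ops:
        if op not in "SH":
            break
        left += n
    right = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        right += n
    return left, right


def passes_filter(mapq, cigar, min_mapq=1, max_clip=500):
    """Keep reads with MAPQ >= min_mapq and at most max_clip clipped bases per end."""
    if mapq < min_mapq or cigar == "*":
        return False
    left, right = end_clipping(parse_cigar(cigar))
    return left <= max_clip and right <= max_clip


def count_mismatches(cigar):
    """Sum the lengths of X operations (assumes extended CIGAR output)."""
    return sum(n for n, op in parse_cigar(cigar) if op == "X")
```

For example, `passes_filter(60, "100S900M50H")` keeps the read, while `passes_filter(60, "600S400M")` rejects it because 600 clipped bases on the left end exceed the 500 nt limit.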

“…Raw nanopore reads were filtered for lambda DNA by aligning to the Enterobacteria phage lambda reference genome (RefSeq assembly accession: GCF_000840245.1) using Graphmap (Sović et al 2016) with -no-end2end parameter and retaining only unmapped reads using Samtools (Li et al 2009). All reads obtained from the Delft and the Frankfurt CEN.PK113-7D stock cultures were assembled de novo using Canu (version 1.3) (Koren et al 2017) with -genomesize set to 12 Mbp.…”

Section: De Novo Genome Assembly (mentioning)

confidence: 99%

Nanopore sequencing enables near-complete de novo assembly of Saccharomyces cerevisiae reference strain CEN.PK113-7D

Salazar, Vries, Broek, et al. 2017

Preprint

The haploid Saccharomyces cerevisiae strain CEN.PK113-7D is a popular model system for metabolic engineering and systems biology research. Current genome assemblies are based on short-read sequencing data scaffolded based on homology to strain S288C. However, these assemblies contain large sequence gaps, particularly in subtelomeric regions, and the assumption of perfect homology to S288C for scaffolding introduces bias. In this study, we obtained a near-complete genome assembly of CEN.PK113-7D using only Oxford Nanopore Technology's MinION sequencing platform. Fifteen of the 16 chromosomes, the mitochondrial genome and the 2-μm plasmid are assembled in single contigs and all but one chromosome starts or ends in a telomere repeat. This improved genome assembly contains 770 Kbp of added sequence containing 248 gene annotations in comparison to the previous assembly of CEN.PK113-7D. Many of these genes encode functions determining fitness in specific growth conditions and are therefore highly relevant for various industrial applications. Furthermore, we discovered a translocation between chromosomes III and VIII that caused misidentification of a MAL locus in the previous CEN.PK113-7D assembly. This study demonstrates the power of long-read sequencing by providing a high-quality reference assembly and annotation of CEN.PK113-7D and places a caveat on assumed genome stability of microorganisms.

“…4) was performed with Graphmap [10] and the Burrows-Wheeler Aligner with the option -x ont2d (BWA-MEM) [11]. The former was found to align reads faster and with fewer mismatches (Fig.…”

Section: Alignment (mentioning)

confidence: 99%

Beyond Homopolymer Errors: a Systematic Investigation of Nanopore-based DNA Sequencing Characteristics Using HLA-DQA2

Sarkozy, Molnár, Fogl, et al. 2017

Period. Polytech. Elec. Eng. Comp. Sci.

“…As a consequence, multiple programs were developed to efficiently overlap long noisy reads, such as BLASR (Chin et al, 2013), DALIGNER (Myers, 2014), MHAP (Berlin et al, 2015), GraphMap (Sović et al, 2016), and Minimap (Li, 2016). These all search for shared seeds between reads, but differ in the way these seeds are found and thereafter used to determine overlap candidates.…”

Section: Introduction (mentioning)

confidence: 99%

Fast and memory-efficient noisy read overlapping with KD-trees

Parkhomchuk, Bremges, McHardy 2017

Preprint

Motivation: Third-generation sequencing technologies produce long, but noisy reads with increasing sequencing throughput and decreasing per-base costs. Detecting read-to-read overlaps in such data is the most computationally intensive step in de novo assembly. Recently, efficient algorithms were developed for this task; nearly all of these utilize long k-mers (>10 bp) to compare reads, but vary in their approaches to indexing, hashing, filtering, and dimensionality reduction. Results: We describe an algorithm for efficient overlap detection that directly compares the full spectrum of short k-mers, namely tetramers, through geometric embedding and approximate nearest neighbor search in multidimensional KD-trees. A proof of concept implementation detected read-to-read overlaps in bacterial PacBio and ONT datasets with notably lower memory consumption than state-of-the-art approaches and allowed downstream de novo assembly into single contigs. We also introduce a sequence-context dependent tagging scheme that contributes to memory and computational efficiency and could be used with other aligning and overlapping algorithms.
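
The idea of comparing reads through their tetramer spectrum can be illustrated with a small Python sketch. It is only an assumption about how such an embedding might look, not the method from the abstract above: each sequence becomes a 256-dimensional vector of normalized tetramer frequencies, and a brute-force Euclidean search stands in for the approximate nearest neighbor lookup in KD-trees. All names are hypothetical.

```python
from itertools import product
from math import sqrt

# All 4^4 = 256 tetramers over the DNA alphabet, each mapped to a vector index.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def tetramer_spectrum(seq):
    """Embed a sequence as a 256-dimensional vector of tetramer frequencies."""
    counts = [0] * len(TETRAMERS)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])  # windows with non-ACGT symbols are skipped
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [c / total for c in counts]


def distance(a, b):
    """Euclidean distance between two spectra."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, candidates):
    """Brute-force nearest neighbor: index of the candidate closest to the query."""
    return min(range(len(candidates)), key=lambda i: distance(query, candidates[i]))
```

A real overlapper would embed windows of each read rather than whole reads and would replace `nearest` with a spatial index, which is where the memory and speed advantages described in the abstract would come from.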
