2014
DOI: 10.1186/preaccept-6768001251451949

BAsE-Seq: a method for obtaining long viral haplotypes from short sequence reads

Abstract: We present a method for obtaining long haplotypes, of over 3 kb in length, using a short-read sequencer, Barcode-directed Assembly for Extra-long Sequences (BAsE-Seq). BAsE-Seq relies on transposing a template-specific barcode onto random segments of the template molecule and assembling the barcoded short reads into complete haplotypes. We applied BAsE-Seq on mixed clones of hepatitis B virus and accurately identified haplotypes occurring at frequencies greater than or equal to 0.4%, with >99.9% specificity. A…
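The barcode-directed assembly idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes reads have already been aligned to a reference and tagged with their template barcode, and the input tuple layout `(barcode, start, sequence)`, the function names, and the toy reads are all hypothetical. Each barcode's reads are piled up by position, a majority-vote consensus gives one haplotype per template molecule, and haplotype frequencies are then counted across barcodes.

```python
from collections import Counter, defaultdict


def assemble_haplotypes(reads, template_length):
    """Build one consensus haplotype per barcode.

    reads: iterable of (barcode, start, sequence) tuples, where `start` is the
    0-based alignment position of the read on the template (assumed layout).
    """
    # piles[barcode][position] counts the bases observed at that position.
    piles = defaultdict(lambda: [Counter() for _ in range(template_length)])
    for barcode, start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < template_length:
                piles[barcode][pos][base] += 1

    haplotypes = {}
    for barcode, columns in piles.items():
        # Majority vote per position; "N" marks positions with no coverage.
        haplotypes[barcode] = "".join(
            col.most_common(1)[0][0] if col else "N" for col in columns
        )
    return haplotypes


def haplotype_frequencies(haplotypes):
    """Fraction of barcodes (template molecules) carrying each haplotype."""
    counts = Counter(haplotypes.values())
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Toy data: three barcoded templates of length 6, covered by short reads.
reads = [
    ("BC1", 0, "ACGT"), ("BC1", 2, "GTAA"),
    ("BC2", 0, "ACCT"), ("BC2", 3, "TAA"),
    ("BC3", 0, "ACGTAA"),
]
haps = assemble_haplotypes(reads, template_length=6)
freqs = haplotype_frequencies(haps)
```

Because every read carries the barcode of the molecule it came from, variants that are far apart on the template can still be phased together, which is what allows haplotypes longer than a single read to be recovered.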

Cited by 30 publications
(30 citation statements)
References 57 publications
“…Longer, more complicated pathways await development of sequencing technologies able to map long DNA stretches with unprecedented accuracy. 47,48 We anticipate such improvements in deep sequencing capability, allowing the use of FluxScan for resolving haplotypes of coupled mutations spread over complete synthetic operons. Reagents, plasmid construction and verification, comprehensive single-site mutagenesis library preparation, growth selections, deep sequencing analysis, biochemical characterization, growth rate and lysate flux measurements of clonal variants, crystallization and structure determination, computational design, accession codes, determination of transport limitations, error approximation for the fitness metric, full DNA and nucleic acid sequences for designs, python scripts, full heatmaps for both selections, fitness metric reproducibility of second selection, distribution of mutations in LGK.9, unselected library frequency distributions, measured versus theoretical flux determination, read coverage statistics, point mutant biochemical properties, specific growth rates of cultures expressing LGK versus LGK.1, fraction ASA buried of improved mutations, LGK design mutations, backcross activities of designs, gene tile PCR primers, and crystallographic statistics.…”
Section: Results and Discussion (mentioning)
confidence: 99%
“…Although there are computational tools 124 to help resolve these issues, new technologies can generate longer reads. Newer, single-molecule sequencers, such as PacBio (Pacific Biosciences) and MinION (Oxford Nanopore), are capable of extremely long-read sequencing, and whole viral genomes (for example, viruses that have genomes less than 20 kb in size, such as Ebola virus, norovirus and influenza A virus) could theoretically be obtained from and selective depletion of DNA with a certain methylation pattern), no similar methods exist, so far, for viral sequencing.…”
Section: Financial Barriers to the Clinical Use of Viral WGS (mentioning)
confidence: 99%
“…In general, the inference of haplotype frequencies using variant frequencies from short sequencing reads for a microbial population undergoing recombination is a complex problem 15 . However, as we are not attempting to comment on abundances of subpopulations but only their presence, we did not need to infer haplotype frequencies.…”
Section: Variant Identification and Clustering Analysis (mentioning)
confidence: 99%