2016
DOI: 10.1038/srep26314

Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene

Abstract: Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the …

Cited by 8 publications (8 citation statements)
References 40 publications (83 reference statements)
“…S1), but these associations could not be confirmed in the subsequent analysis of sequenced amplicons of these regions. This might be due to the regions showing sequence patterns known to cause sequencing biases such as palindromes, inverted repeats, long homopolymers or other repetitive regions [35–37], highlighting the importance of uniform sequencing data and the need to confirm positive results when using such highly sensitive bioinformatic methods as k-mer-based GWAS from short read sequencing data.…”
Section: Discussion (mentioning)
confidence: 99%
“…The motif ‘CCNGCC’ is known to potentially cause a steep drop in Illumina read coverage [14] and increase sequencing errors [13]. Mixed results were obtained with the effect of this motif on read coverage of the haplotypes.…”
Section: Results (mentioning)
confidence: 99%
“…Our assumption is that the short read coverage of the original DNA sequences is proportional to the amount of DNA loaded on the sequencing run. Variation in the read coverage of Illumina sequencing technology is known to be sensitive to GC content [11, 12] as well as to specific motifs [13, 14]. We expect coverage of the entire mixture to vary along the length of the amplified fragments.…”
Section: Introduction (mentioning)
confidence: 99%
“…Independent of a PCR amplification bias, this could be related to the sequencing technology itself. Indeed, MiSeq sequencing that used the same four-channel sequencing chemistry than HiSeq has been shown to disfavor the "CCNGCC" motif in the GFP coding sequence (Van den Hoecke et al, 2016). On the other hand, sequencing technologies such as single molecule real-time (SMRT) sequencing (Pacific Biosciences) is described as giving a less biased coverage across GC-rich regions (Ross et al, 2013).…”
Section: Discussion (mentioning)
confidence: 99%
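
Several of the statements above refer to the "CCNGCC" motif, where N stands for any base. As a minimal sketch of how such sites could be located in a sequence before interpreting a coverage dip, the snippet below scans one strand with a regular expression. The function name `find_ccngcc` and the example sequence are hypothetical and are not taken from the paper or from the GFP gene.

```python
import re

# 'N' in the motif means any of A, C, G, T (assumption: no ambiguity codes
# in the input). The lookahead lets overlapping matches be reported too.
MOTIF = re.compile(r"(?=(CC[ACGT]GCC))")


def find_ccngcc(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each CCNGCC hit on the given strand."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Made-up example sequence for illustration only.
example = "ATGCCAGCCTTCCTGCCGGA"
print(find_ccngcc(example))  # [(3, 'CCAGCC'), (11, 'CCTGCC')]
```

This only scans the strand as given. The motif is not its own reverse complement, so a fuller check would also scan the reverse-complemented sequence.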