2008
DOI: 10.1038/nmeth.1179

Whole-genome sequencing and variant discovery in C. elegans

Abstract: Massively parallel sequencing instruments enable rapid and inexpensive DNA sequence data production. Because these instruments are new, their data require characterization with respect to accuracy and utility. To address this, we sequenced a Caenorhabditis elegans N2 Bristol strain isolate using the Solexa Sequence Analyzer, and compared the reads to the reference genome to characterize the data and to evaluate coverage and representation. Massively parallel sequencing facilitates strain-to-reference compariso…

Citations: cited by 373 publications (322 citation statements)
References: 16 publications
“…This 44% reduction in polymorphism in the inbred genome is smaller than the 59.4% predicted from four generations of brother-sister mating, indicating that selection favouring heterozygotes had occurred 19. The polymorphism combining inbred and wild (among four haplotypes) was 2.3%, higher than that in most studied animal genomes 20,21 but comparable to that in known high-polymorphism species 7. In inbred and wild, we found 3,094 short indels located in coding regions inferred to cause frameshift variants in 2,665 genes, providing an important source for recessive lethal mutations.…”
Section: Polymorphism and Repetitive Sequences
Citation type: mentioning
confidence: 60%
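The 59.4% figure quoted above matches the textbook inbreeding coefficient after four generations of full-sib mating. The sketch below reproduces that arithmetic with the standard recurrence F_t = (1 + 2·F_{t-1} + F_{t-2}) / 4, starting from a non-inbred, unrelated founding pair. That the citing authors used exactly this recurrence is an assumption; the function name is illustrative only.

```python
def full_sib_inbreeding(generations: int) -> list[float]:
    """Inbreeding coefficients F_1..F_n under repeated brother-sister mating.

    Assumes the founders are non-inbred and unrelated (F_0 = F_-1 = 0).
    The expected loss of heterozygosity after t generations equals F_t.
    """
    f_prev2, f_prev1 = 0.0, 0.0
    coefficients = []
    for _ in range(generations):
        # Standard full-sib recurrence: F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4
        f_now = 0.25 * (1.0 + 2.0 * f_prev1 + f_prev2)
        coefficients.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return coefficients


if __name__ == "__main__":
    for generation, f in enumerate(full_sib_inbreeding(4), start=1):
        print(f"generation {generation}: F = {f:.5f}")
```

The fourth value is 0.59375, which rounds to the 59.4% expected reduction; the observed 44% reduction falls short of it, which is the basis for the quoted inference about selection favouring heterozygotes.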
“…We use MOSAIK (http://bioinformatics.bc.edu/marthlab/Mosaik) to map the reads to the human reference genome (hg19) with parameters -hs 15 -p 12 -mmp 0.05 -act 26 -mhp 100 -bw 51 as recommended in its documentation. MOSAIK is a widely used reference-guided assembler that hashes the whole reference genome and locates information in the hash table using a 'jump database' [19-21]. Then we use SAMtools (http://samtools.sourceforge.net/) [22] to pileup the reads onto the target regions.…”
Section: Methods
Citation type: mentioning
confidence: 99%
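To make the "pileup" step in the statement above concrete, here is a toy illustration of what piling reads onto a target region means: for each reference position in the region, tally the bases that aligned reads place there. This is not SAMtools' implementation or file format. The read representation (0-based start plus an ungapped sequence) and the function name are assumptions made for the sketch.

```python
from collections import Counter, defaultdict


def pileup(reads, region_start, region_end):
    """Tally aligned bases per reference position inside a target region.

    reads: iterable of (start, sequence) pairs; start is a 0-based reference
           coordinate and the alignment is assumed ungapped (hypothetical
           simplification, real alignments carry indels and clipping).
    Returns {position: Counter(base -> count)} for
    region_start <= position < region_end.
    """
    columns = defaultdict(Counter)
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if region_start <= position < region_end:
                columns[position][base] += 1
    return dict(columns)


if __name__ == "__main__":
    example_reads = [(0, "ACGT"), (2, "GTTA"), (3, "TTAC")]
    for position, counts in sorted(pileup(example_reads, 2, 5).items()):
        print(position, dict(counts))
```

Each column's total count is the alignment depth at that position, which is the quantity downstream variant callers consume.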
“…New theoretical work is also aiming to statistically distinguish true polymorphisms from sequencing errors (see, for example, Lynch, 2009). GigaBayes (Hillier et al, 2008) and VarScan (Koboldt et al, 2009) represent examples of programs for SNP detection. GigaBayes calculates the probability that a polymorphism represents a true SNP or a sequencing error; for this calculation the program uses a Bayesian approach, taking into account the alignment depth, the base call in each sequence, the base composition in the region and the expected a priori polymorphism rate.…”
Section: Large-scale Identification and Development Of Molecular Markers
Citation type: mentioning
confidence: 99%
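The Bayesian idea described in the statement above can be sketched with a deliberately simplified two-hypothesis model: the site either carries the reference allele or one alternative allele, and every observed base is either correct or a sequencing error. This is not GigaBayes' actual model. It ignores base composition, assumes a homozygous sample, and assumes per-base Phred quality scores are available; the function name and the default prior are illustrative.

```python
import math


def snp_posterior(bases, quals, ref, alt, prior=0.001):
    """Posterior probability that a site carries `alt` rather than `ref`.

    bases: observed base calls covering the site (depth = len(bases)).
    quals: Phred quality per base call (assumed input, error = 10^(-Q/10)).
    prior: assumed a priori polymorphism rate (illustrative default).
    A miscalled base is assumed equally likely to be any of the 3 other bases.
    """
    log_ref = math.log(1.0 - prior)
    log_alt = math.log(prior)
    for base, qual in zip(bases, quals):
        error = 10.0 ** (-qual / 10.0)
        log_ref += math.log(1.0 - error if base == ref else error / 3.0)
        log_alt += math.log(1.0 - error if base == alt else error / 3.0)
    # Normalise in log space to avoid underflow at high depth.
    shift = max(log_ref, log_alt)
    weight_alt = math.exp(log_alt - shift)
    weight_ref = math.exp(log_ref - shift)
    return weight_alt / (weight_alt + weight_ref)


if __name__ == "__main__":
    print(snp_posterior("CCCCCCCCCC", [30] * 10, ref="A", alt="C"))
    print(snp_posterior("AAAAAAAAAAC", [20] * 11, ref="A", alt="C"))
```

With consistent high-quality support for the alternative allele the posterior approaches 1, while a single discordant base among many reference calls is attributed to sequencing error.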