Feeding the future
We must mine the biodiversity in seed banks to help to overcome food shortages, urge Susan McCouch and colleagues. The International Center for Tropical Agriculture in Colombia holds 65,000 crop samples from 141 countries.
Background: Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species. Results: Genotyping by Sequencing (GBS) was used to produce highly saturated maps for a R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison. Conclusions: GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable. The high marker density allowed identification of genomic regions with segregation distortion in R. idaeus, which may help to identify deleterious alleles that are the basis of inbreeding depression in the species.
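As an illustration of how segregation distortion at a single marker can be flagged, the sketch below runs a chi-square goodness-of-fit test against the 1:1 ratio expected for a pseudo-testcross marker. The abstract does not state which test the authors used, so the choice of test, the function name `segregation_distortion`, and the genotype counts are all assumptions made for illustration.

```python
import math


def segregation_distortion(n_a: int, n_b: int) -> tuple[float, float]:
    """Chi-square test (1 degree of freedom) of observed genotype counts
    against the 1:1 segregation expected for a pseudo-testcross marker.

    n_a, n_b: number of progeny carrying each of the two genotype classes
    (missing calls are assumed to have been excluded already).
    Returns (chi-square statistic, p-value).
    """
    total = n_a + n_b
    if total == 0:
        raise ValueError("no genotype calls for this marker")
    expected = total / 2
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2_balanced, p_balanced = segregation_distortion(50, 50)
chi2_skewed, p_skewed = segregation_distortion(70, 30)
print(f"balanced: chi2={chi2_balanced:.2f}, p={p_balanced:.3g}")
print(f"skewed:   chi2={chi2_skewed:.2f}, p={p_skewed:.3g}")
```

On a map with thousands of markers, a per-marker test like this would normally be combined with a multiple-testing correction, and runs of adjacent distorted markers are more informative than isolated ones.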
Background: Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform offers the promise of increased read length and unbiased genome coverage and thus the potential to produce genome sequence data of a finished quality containing fewer gaps and longer contigs. However, these advantages come at a much greater cost per nucleotide and with a perceived increase in error-rate. In this investigation, we evaluated the performance of the PacBio RS sequencing platform through the sequencing and de novo assembly of the Potentilla micrantha chloroplast genome. Results: Following error-correction, a total of 28,638 PacBio RS reads were recovered with a mean read length of 1,902 bp totalling 54,492,250 nucleotides and representing an average depth of coverage of 320× the chloroplast genome. The dataset covered the entire 154,959 bp of the chloroplast genome in a single contig (100% coverage) compared to seven contigs (90.59% coverage) recovered from an Illumina dataset, and revealed no bias in coverage of GC rich regions. Post-assembly, the data were largely concordant with the Illumina data generated and allowed 187 ambiguities in the Illumina data to be resolved. The additional read length also permitted small differences in the two inverted repeat regions to be assigned unambiguously. Conclusions: This is the first report to our knowledge of a chloroplast genome assembled de novo using PacBio sequence data. The PacBio RS data generated here were assembled into a single large contig spanning the P. micrantha chloroplast genome, with a higher degree of accuracy than an Illumina dataset generated at a much greater depth of coverage, due to longer read lengths and lower GC bias in the data. The results we present suggest PacBio data will be of immense utility for the development of genome sequence assemblies containing fewer unresolved gaps and ambiguities and a significantly smaller number of contigs than could be produced using short-read sequence data alone.
Even with recent reductions in sequencing costs, most plants lack the genomic resources required for successful short-read transcriptome analyses as performed routinely in model species. Several approaches for the analysis of short-read transcriptome data are reviewed for nonmodel species for which the genome of a close relative is used as the reference genome. Two approaches using a data set from Phytophthora-challenged Rubus idaeus (red raspberry) are compared. Over 70,000,000 86-nt Illumina reads derived from R. idaeus roots were aligned to the Fragaria vesca genome using publicly available informatics tools (Bowtie/TopHat and Cufflinks). Alignment identified 16,956 putatively expressed genes. De novo assembly was performed with the same data set and a publicly available transcriptome assembler (Trinity). A BLAST search with a maximum e-value threshold of 1.0 × 10^-3 revealed that over 36,000 transcripts had matches to plants and over 500 to Phytophthora. Gene expression estimates from alignment to F. vesca and de novo assembly were compared for raspberry (Pearson's correlation = 0.730). Together, alignment to the genome of a close relative and de novo assembly constitute a powerful method of transcriptome analysis in nonmodel organisms. Alignment to the genome of a close relative provides a framework for differential expression testing if alignments are made to the predefined gene-space of a close relative and de novo assembly provides a more robust method of identifying unique sequences and sequences from other organisms in a system. These methods are considered experimental in nonmodel systems, but can be used to generate resources and specific testable hypotheses.
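The comparison of expression estimates reported above is summarised by Pearson's correlation. A minimal sketch of that statistic follows, assuming the two methods' estimates have already been matched gene by gene; the function name `pearson` and the expression values are hypothetical and are not data from the study, and the abstract does not say whether values were transformed before the comparison.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson's correlation coefficient between two equal-length lists
    of paired expression estimates (one pair per gene)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene expression estimates from the two approaches.
alignment_based = [1.0, 2.0, 3.0, 4.0]
de_novo_based = [2.0, 4.0, 6.0, 8.0]
r_same_trend = pearson(alignment_based, de_novo_based)
r_opposite_trend = pearson(alignment_based, de_novo_based[::-1])
print(r_same_trend, r_opposite_trend)
```

A value near 1 indicates that the two methods rank and scale genes similarly; the reported 0.730 sits between that and no linear relationship.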