Many plant species of great economic value (e.g., potato, wheat, cotton, and sugarcane) are polyploids. Despite the essential roles of autopolyploid plants in human activities, our genetic understanding of these species is still poor. Recent progress in instrumentation and biochemical manipulation has led to the accumulation of an incredible amount of genomic data. In this study, we demonstrate for the first time a successful genetic analysis in a highly polyploid genome (sugarcane) by the quantitative analysis of single-nucleotide polymorphism (SNP) allelic dosage and the application of a new data analysis framework. This study provides a better understanding of autopolyploid genomic structure and is a sound basis for genetic studies. The proposed methods can be employed to analyse the genome of any autopolyploid and will permit the future development of high-quality genetic maps to assist in the assembly of reference genome sequences for polyploid species.
Sugarcane is an important crop and a major source of sugar and alcohol. In this study, we performed de novo assembly and transcriptome annotation for six sugarcane genotypes involved in bi-parental crosses. The de novo assembly of the sugarcane transcriptome was performed using short reads generated using the Illumina RNA-Seq platform. We produced more than 400 million reads, which were assembled into 72,269 unigenes. Based on a similarity search, the unigenes showed significant similarity to more than 28,788 sorghum proteins, including a set of 5,272 unigenes that are not present in the public sugarcane EST databases; many of these unigenes are likely putative undescribed sugarcane genes. From this collection of unigenes, a large number of molecular markers were identified, including 5,106 simple sequence repeats (SSRs) and 708,125 single-nucleotide polymorphisms (SNPs). This new dataset will be a useful resource for future genetic and genomic studies in this species.
BackgroundSugarcane is the source of sugar in all tropical and subtropical countries and is becoming increasingly important for bio-based fuels. However, its large (10 Gb), polyploid, complex genome has hindered genome based breeding efforts. Here we release the largest and most diverse set of sugarcane genome sequences to date, as part of an on-going initiative to provide a sugarcane genomic information resource, with the ultimate goal of producing a gold standard genome.ResultsThree hundred and seventeen chiefly euchromatic BACs were sequenced. A reference set of one thousand four hundred manually-annotated protein-coding genes was generated. A small RNA collection and a RNA-seq library were used to explore expression patterns and the sRNA landscape. In the sucrose and starch metabolism pathway, 16 non-redundant enzyme-encoding genes were identified. One of the sucrose pathway genes, sucrose-6-phosphate phosphohydrolase, is duplicated in sugarcane and sorghum, but not in rice and maize. A diversity analysis of the s6pp duplication region revealed haplotype-structured sequence composition. Examination of hom(e)ologous loci indicate both sequence structural and sRNA landscape variation. A synteny analysis shows that the sugarcane genome has expanded relative to the sorghum genome, largely due to the presence of transposable elements and uncharacterized intergenic and intronic sequences.ConclusionThis release of sugarcane genomic sequences will advance our understanding of sugarcane genetics and contribute to the development of molecular tools for breeding purposes and gene discovery.Electronic supplementary materialThe online version of this article (doi:10.1186/1471-2164-15-540) contains supplementary material, which is available to authorized users.
Background Sugarcane cultivars are polyploid interspecific hybrids of giant genomes, typically with 10–13 sets of chromosomes from 2 Saccharum species. The ploidy, hybridity, and size of the genome, estimated to have >10 Gb, pose a challenge for sequencing. Results Here we present a gene space assembly of SP80-3280, including 373,869 putative genes and their potential regulatory regions. The alignment of single-copy genes in diploid grasses to the putative genes indicates that we could resolve 2–6 (up to 15) putative homo(eo)logs that are 99.1% identical within their coding sequences. Dissimilarities increase in their regulatory regions, and gene promoter analysis shows differences in regulatory elements within gene families that are expressed in a species-specific manner. We exemplify these differences for sucrose synthase (SuSy) and phenylalanine ammonia-lyase (PAL), 2 gene families central to carbon partitioning. SP80-3280 has particular regulatory elements involved in sucrose synthesis not found in the ancestor Saccharum spontaneum. PAL regulatory elements are found in co-expressed genes related to fiber synthesis within gene networks defined during plant growth and maturation. Comparison with sorghum reveals predominantly bi-allelic variations in sugarcane, consistent with the formation of 2 “subgenomes” after their divergence ∼3.8–4.6 million years ago and reveals single-nucleotide variants that may underlie their differences. Conclusions This assembly represents a large step towards a whole-genome assembly of a commercial sugarcane cultivar. It includes a rich diversity of genes and homo(eo)logous resolution for a representative fraction of the gene space, relevant to improve biomass and food production.
BackgroundSugarcane (Saccharum spp.) is predominantly an autopolyploid plant with a variable ploidy level, frequent aneuploidy and a large genome that hampers investigation of its organization. Genetic architecture studies are important for identifying genomic regions associated with traits of interest. However, due to the genetic complexity of sugarcane, the practical applications of genomic tools have been notably delayed in this crop, in contrast to other crops that have already advanced to marker-assisted selection (MAS) and genomic selection. High-throughput next-generation sequencing (NGS) technologies have opened new opportunities for discovering molecular markers, especially single nucleotide polymorphisms (SNPs) and insertion-deletion (indels), at the genome-wide level. The objectives of this study were to (i) establish a pipeline for identifying variants from genotyping-by-sequencing (GBS) data in sugarcane, (ii) construct an integrated genetic map with GBS-based markers plus target region amplification polymorphisms and microsatellites, (iii) detect QTLs related to yield component traits, and (iv) perform annotation of the sequences that originated the associated markers with mapped QTLs to search putative candidate genes.ResultsWe used four pseudo-references to align the GBS reads. Depending on the reference, from 3,433 to 15,906 high-quality markers were discovered, and half of them segregated as single-dose markers (SDMs) on average. In addition to 7,049 non-redundant SDMs from GBS, 629 gel-based markers were used in a subsequent linkage analysis. Of 7,678 SDMs, 993 were mapped. These markers were distributed throughout 223 linkage groups, which were clustered in 18 homo(eo)logous groups (HGs), with a cumulative map length of 3,682.04 cM and an average marker density of 3.70 cM. We performed QTL mapping of four traits and found seven QTLs. Our results suggest the presence of a stable QTL across locations. Furthermore, QTLs to soluble solid content (BRIX) and fiber content (FIB) traits had markers linked to putative candidate genes.ConclusionsThis study is the first to report the use of GBS for large-scale variant discovery and genotyping of a mapping population in sugarcane, providing several insights regarding the use of NGS data in a polyploid, non-model species. The use of GBS generated a large number of markers and still enabled ploidy and allelic dosage estimation. Moreover, we were able to identify seven QTLs, two of which had great potential for validation and future use for molecular breeding in sugarcane.Electronic supplementary materialThe online version of this article (doi:10.1186/s12864-016-3383-x) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.