the 3,000 accessions can be subdivided into nine subpopulations, where most accessions from close subgroups could be associated with geographic origin [12]. A critical gap in these analyses is that single nucleotide polymorphisms (SNPs) and structural variations (SVs) present in subpopulation-specific genomic regions have yet to be detected, because the 3K-RG dataset was aligned to only a single reference genome. Therefore, the next logical step to capture and understand genetic variation pan-subpopulation-wide is to map the 3K-RG dataset to high-quality reference genomes that represent each of the subpopulations of cultivated Asian rice. At present, only a handful of high-quality genomes for cultivated rice are publicly available [5,6,13,14]; thus, there is an immediate need for such a comprehensive resource to be created, which is the subject of this Data Descriptor. Here we present a reanalysis of the population structure analysis discussed above [12] and show that the 3K-RG dataset can be further subdivided into a total of 15 subpopulations. We then present the generation of 12 new, near-gap-free, high-quality PacBio long-read reference genomes from representative accessions of the 12 subpopulations of cultivated Asian rice for which no high-quality reference genomes exist. All 12 genomes were assembled with more than 100x genome coverage of PacBio long-read sequence data and then validated with Bionano optical maps [15]. The number of contigs covering each of the 12 assemblies, excluding unplaced contigs, ranged from 15 (GOBOL SAIL (BALAM)::IRGC 26624-2) to 104 (IR 64). The contig N50 value for the 12-genome dataset ranged from 7.35 Mb to 31.91 Mb. When combined with 4 previously published genomes (i.e. Minghui 63 (MH 63) and Zhenshan 97 (ZS 97) [13,14], N 22 [5] and the IRGSP RefSeq [6]), this 16-genome dataset can be used to represent the K = 15 population/admixture structure of cultivated Asian rice.
Methods: ethics statement.
This work was approved by the University of Arizona (UA),
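The contig N50 values quoted in the excerpt above (7.35 Mb to 31.91 Mb) follow a standard definition: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The sketch below illustrates that definition only; the function name `contig_n50` and the example lengths are hypothetical and are not taken from the authors' pipeline or data.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length of the contig at which the cumulative length,
    counted from the longest contig downwards, first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in Mb), for illustration only.
example_lengths = [40, 30, 20, 10]
print(contig_n50(example_lengths))
```

A higher N50 for the same total length means the assembly is built from fewer, longer contigs, which is why it is reported alongside the contig counts.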
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene–gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.
Understanding and exploiting genetic diversity is a key factor for the productive and stable production of rice. Utilizing 16 high-quality genomes that represent the subpopulation structure of Asian rice (O. sativa), plus the genomes of two close relatives (O. rufipogon and O. punctata), we built a pan-genome inversion index of 1,054 non-redundant inversions that span an average of ~ 14% of the O. sativa cv. Nipponbare reference genome sequence. Using this index we estimated an inversion rate of 1,100 inversions per million years in Asian rice, which is 37 to 73 times higher than previously estimated for plants. Detailed analyses of these inversions showed evidence of their effects on gene regulation, recombination rate, linkage disequilibrium and agronomic trait performance. Our study uncovers the prevalence and scale of large inversions (≥ 100 bp) across the pan-genome of Asian rice, and hints at their largely unexplored role in functional biology and crop performance.
Understanding and exploiting genetic diversity is a key factor for the productive and stable production of rice. Here, we utilize 73 high-quality genomes that encompass the subpopulation structure of Asian rice (Oryza sativa), plus the genomes of two wild relatives (O. rufipogon and O. punctata), to build a pan-genome inversion index of 1769 non-redundant inversions that span an average of ~29% of the O. sativa cv. Nipponbare reference genome sequence. Using this index, we estimate an inversion rate of ~700 inversions per million years in Asian rice, which is 16 to 50 times higher than previously estimated for plants. Detailed analyses of these inversions show evidence of their effects on gene expression, recombination rate, and linkage disequilibrium. Our study uncovers the prevalence and scale of large inversions (≥100 bp) across the pan-genome of Asian rice and hints at their largely unexplored role in functional biology and crop performance.
The S-domain subfamily of receptor-like kinases (SDRLKs) in plants is poorly characterized. Most members of this subfamily are currently assigned gene function based on the S-locus Receptor Kinase from Brassica, which acts as the female determinant of self-incompatibility (SI). However, Brassica-like SI mechanisms do not exist in most plants. Thus, automated Gene Ontology (GO) pipelines are not sufficient for functional annotation of SDRLK subfamily members and lead to erroneous association with the GO biological process of SI. Here, we show that manual biocuration can help to correct and improve the gene annotations and their association with relevant biological processes. Using publicly available genomic and transcriptome datasets, we conducted a detailed analysis of the expansion of the rice (Oryza sativa) SDRLK subfamily, the structure of individual genes and proteins, and their expression. The 144-member SDRLK family in rice consists of 82 receptor-like kinases (RLKs) (67 full-length, 15 truncated), 12 receptor-like proteins, 14 SD kinases, 26 kinase-like and 10 GnK2 domain-containing kinases and RLKs. All but nine SDRLK family members are transcribed in rice, but they vary in their tissue-specific and stress-response expression profiles. Furthermore, 98 genes show differential expression under biotic stress and 98 genes show differential expression under abiotic stress conditions, with 81 genes common to both sets. Our analysis led to the identification of candidate genes likely to play important roles in plant development, pathogen resistance, and abiotic stress tolerance. We propose a nomenclature for the 144 SDRLK gene family members based on conserved gene/protein structural features, gene expression profiles, and literature review.
Our biocuration approach, rooted in the principles of findability, accessibility, interoperability and reusability, sets an example of how manual annotation of large gene families can fill the knowledge gap left by automated GO projections, thereby helping to improve the quality and content of public databases.