We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
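The headline figures in this abstract can be turned into rough absolute numbers. The short Python sketch below is only a back-of-the-envelope illustration: it treats the quoted values ("2.3-gigabase", "over 32,000", "99.8%", "nearly 85%") as exact, which they are not, and the function and variable names are hypothetical rather than taken from the paper.

```python
# Approximate figures as quoted in the abstract; treating them as exact
# values is an assumption made purely for illustration.
GENOME_SIZE_BP = 2.3e9      # "2.3-gigabase genome"
PREDICTED_GENES = 32_000    # "over 32,000 genes"
FRACTION_PLACED = 0.998     # "99.8% were placed on reference chromosomes"
TE_FRACTION = 0.85          # "nearly 85% ... transposable elements"


def implied_counts(genome_bp, genes, placed_fraction, te_fraction):
    """Derive rough absolute numbers from the quoted fractions."""
    placed = round(genes * placed_fraction)   # genes anchored to chromosomes
    unplaced = genes - placed                 # genes left unanchored
    te_bp = genome_bp * te_fraction           # bases in transposable elements
    non_te_bp = genome_bp - te_bp             # remaining bases
    return {
        "placed_genes": placed,
        "unplaced_genes": unplaced,
        "te_bp": te_bp,
        "non_te_bp": non_te_bp,
    }


summary = implied_counts(
    GENOME_SIZE_BP, PREDICTED_GENES, FRACTION_PLACED, TE_FRACTION
)
for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions, only a few dozen predicted genes remain unplaced, while close to two gigabases of sequence are attributed to transposable elements.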
Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation [1]. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions [2]. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome [3], our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing [4]. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.
We describe a comprehensive and general approach for mapping centromeres and present a detailed characterization of two maize centromeres. Centromeres are difficult to map and analyze because they consist primarily of repetitive DNA sequences, which in maize are the tandem satellite repeat CentC and interspersed centromeric retrotransposons of maize (CRM). Centromeres are defined epigenetically by the centromeric histone H3 variant, CENH3. Using novel markers derived from centromere repeats, we have mapped all ten centromeres onto the physical and genetic maps of maize. We were able to completely traverse centromeres 2 and 5, confirm physical maps by fluorescence in situ hybridization (FISH), and delineate their functional regions by chromatin immunoprecipitation (ChIP) with anti-CENH3 antibody followed by pyrosequencing. These two centromeres differ substantially in size, apparent CENH3 density, and arrangement of centromeric repeats, and both are larger than the rice centromeres characterized to date. Furthermore, centromere 5 consists of two distinct CENH3 domains that are separated by several megabases. Succession of centromere repeat classes is evidenced by the fact that elements belonging to the recently active recombinant subgroups of CRM1 colonize the present-day centromeres, while elements of the ancestral subgroups are also found in the flanking regions. Using abundant CRM and non-CRM retrotransposons that inserted in and near these two centromeres to create a historical record of centromere location, we show that maize centromeres are fluid genomic regions whose borders are heavily influenced by the interplay of retrotransposons and epigenetic marks. Furthermore, we propose that CRMs may be involved in removal of centromeric DNA (specifically CentC), invasion of centromeres by non-CRM retrotransposons, and local repositioning of CENH3.
Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate elucidation of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here, we report the assembly and annotation of the genome of maize, a genetic and agricultural model crop, using Single Molecule Real-Time (SMRT) sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and significant improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed over 130,000 intact transposable elements (TEs), allowing us to identify TE lineage expansions unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by SMRT sequencing. In addition, comparative optical mapping of two other inbreds revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.
Functional centromeres, the chromosomal sites of spindle attachment during cell division, are marked epigenetically by the centromere-specific histone H3 variant cenH3 and typically contain long stretches of centromere-specific tandem DNA repeats (∼1.8 Mb in maize). In 23 inbreds of domesticated maize chosen to represent the genetic diversity of maize germplasm, partial or nearly complete loss of the tandem DNA repeat CentC precedes 57 independent cenH3 relocation events that result in neocentromere formation. Chromosomal regions with newly acquired cenH3 are colonized by the centromere-specific retrotransposon CR2 at a rate that would result in centromere-sized CR2 clusters in 20,000-95,000 years. Three lines of evidence indicate that CentC loss is linked to inbreeding, including (i) CEN10 of temperate lineages, presumed to have experienced a genetic bottleneck, contain less CentC than their tropical relatives; (ii) strong selection for centromere-linked genes in domesticated maize reduced diversity at seven of the ten maize centromeres to only one or two postdomestication haplotypes; and (iii) the centromere with the largest number of haplotypes in domesticated maize (CEN7) has the highest CentC levels in nearly all domesticated lines. Rare recombinations introduced one (CEN2) or more (CEN5) alternate CEN haplotypes while retaining a single haplotype at domestication loci linked to these centromeres. Taken together, this evidence strongly suggests that inbreeding, favored by postdomestication selection for centromere-linked genes affecting key domestication or agricultural traits, drives replacement of the tandem centromere repeats in maize and other crop plants. Similar forces may act during speciation in natural systems.
centromere drive | centromere paradox | founder effect | hemicentric inversion | linkage disequilibrium
Centromere-specific tandemly arranged DNA repeats vary in length and nucleotide sequence between species.
The puzzling observation that centromeres can consist of highly variable sequences despite being involved in an essential cellular function (i.e., chromosome segregation) has been termed the "centromere paradox" (1). "Centromere drive" has been proposed to preferentially segregate the "favored" centromere into the female gamete and thereby provide the selective force that acts on centromere DNA sequences and interacting proteins (2).
Maize (Zea mays ssp. mays) was domesticated between 7.5 and 10 thousand years ago (ka) from wind-pollinated outcrossing wild teosinte (Z. mays ssp. parviglumis) (3, 4) in a process that dramatically changed its morphology. Several quantitative trait loci (QTLs) responsible for these morphological changes were identified in pioneering work (5-8), and a large number of additional genetic loci involved in maize domestication and improvement were subsequently identified in genome-wide scans (9). Gene (and centromere) flow between the fully interfertile maize and teosinte subspecies has been documented (10, 11). Functional centromeres of maize consist of 1-2 Mb of DN...