Rice (Oryza sativa), a major staple throughout the world and a model system for plant genomics and breeding, was the first crop to have its genome sequenced, almost two decades ago. However, reference genomes for all higher organisms to date contain gaps and missing sequences. Here, we report the assembly and analysis of gap-free reference genome sequences for two elite O. sativa xian/indica rice varieties, Zhenshan 97 and Minghui 63, which are being used as a model system for studying heterosis and yield. Gap-free reference genomes provide the opportunity for a global view of the structure and function of centromeres. We show that all rice centromeric regions share conserved centromere-specific satellite motifs with different copy numbers and structures. In addition, the similarity of CentO repeats in the same chromosome is higher than across chromosomes, supporting a model of local expansion and homogenization. Both genomes have over 395 non-TE genes located in centromere regions, of which ~41% are actively transcribed. Two large structural variants at the end of chromosome 11 affect the copy number of resistance genes between the two genomes. The availability of the two gap-free genomes lays a solid foundation for further understanding genome structure and function in plants and breeding climate-resilient varieties.
Cotton is an agriculturally important crop. Because of its importance, a genome sequence of a diploid cotton species (Gossypium raimondii, D-genome) was first assembled using Sanger sequencing data in 2012. Advances in DNA sequencing technology have since improved the accuracy and correctness of assembled genome sequences. Here we report a new de novo genome assembly of G. raimondii and its close relative G. turneri. The two genomes were assembled to a chromosome level using PacBio long-read technology, Hi-C, and Bionano optical mapping. This report corrects some minor assembly errors found in the Sanger assembly of G. raimondii. We also compare the genome sequences of these two species for gene composition, repetitive element composition, and collinearity. Most of the identified structural rearrangements between these two species are due to intra-chromosomal inversions. More inversions were found in the G. turneri genome sequence than in the G. raimondii genome sequence. These findings and updates to the D-genome sequence will improve accuracy and translation of genomics to cotton breeding and genetics.
One of the extraordinary aspects of plant genome evolution is variation in chromosome number, particularly that among closely related species. This is exemplified by the cotton genus (Gossypium) and its relatives, where most species and genera have a base chromosome number of 13. The two exceptions are sister genera that have n = 12 (the Hawaiian Kokia and the East African and Madagascan Gossypioides). We generated a high-quality genome sequence of Gossypioides kirkii (n = 12) using PacBio, Bionano, and Hi-C technologies, and compared this assembly to genome sequences of Kokia (n = 12) and Gossypium diploids (n = 13). Previous analysis demonstrated that the directionality of their reduced chromosome number was through large structural rearrangements. A series of structural rearrangements were identified comparing the de novo G. kirkii genome sequence to genome sequences of Gossypium, including chromosome fusions and inversions. Genome comparison between G. kirkii and Gossypium suggests that multiple steps are required to generate the extant structural differences.
Chromosomal structural variations (SV) including insertions, deletions, inversions, and translocations occur within the genome and can have a significant effect on organismal phenotype. Some of these effects are caused by structural variations containing genes. Large structural variations represent a significant amount of the genetic diversity within a population. We used a global sampling of Drosophila melanogaster (Ithaca, Zimbabwe, Beijing, Tasmania, and Netherlands) to represent diverse populations within the species. We used long-read sequencing and optical mapping technologies to identify SVs in these genomes. Among the five lines examined, we found an average of 2,928 structural variants within these genomes. These structural variations varied greatly in size and location, included many exonic regions, and could impact adaptation and genomic evolution.