The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, noncoding RNA, and small RNA. The most recent annotation update (TAIR10), released more than five years ago, had a profound impact on Arabidopsis research. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-seq libraries from 113 datasets and constructed 48,359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of noncoding RNA, including small RNA, long intergenic RNA, small nucleolar RNA, natural antisense transcripts, small nuclear RNA, and microRNA, using published datasets and in-house analytic results. Altogether, we identified 738 novel protein-coding genes, 508 novel transcribed regions, 5,051 noncoding genes, and 35,846 small-RNA loci that formerly eluded annotation. Analysis of the splicing events and RNA-seq-based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation, with a substantially increased resolution of gene models, will further our understanding of biological processes not only in this plant model but also in other species. The literature since TAIR10 reveals a growing amount of information about noncoding RNA, including long intergenic RNA, natural antisense transcripts, small RNA, microRNA, small nuclear RNA, small nucleolar RNA, and tRNA (Sherstnev et al. ...
bioRxiv preprint first posted online Apr. 5, doi: http://dx.doi.org/10.1101/047308. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Background: Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal of deciphering sequences originating from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path, culminating in a BAC-based assembly (Mt3.5) as well as an in-depth analysis of the genome published in 2011. Results: Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0) based on de novo whole-genome shotgun assembly of a majority of Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass ~360 Mb of actual sequence spanning 390 Mb, of which ~330 Mb align perfectly with the optical map. This is a drastic improvement over the BAC-based Mt3.5, which contained only 70% (~250 Mb) of the sequence in the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequences. The genome has also been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein, and gene prediction evidence. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0, which overlap with ~82% of the gene loci annotated in Mt3.5. Of the remaining genes, 14% of the Mt3.5 genes have been deprecated to an “unsupported” status and 4% are absent from the Mt4.0 predictions. Conclusions: Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets, and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago).
The assembly and annotation have been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.
Sugarcane (Saccharum spp.) is a major crop for sugar and bioenergy production. Its highly polyploid, aneuploid, heterozygous, and interspecific genome poses major challenges for producing a reference sequence. We exploited colinearity with sorghum to produce a BAC-based monoploid genome sequence of sugarcane. A minimum tiling path of 4,660 sugarcane BACs that best covers the gene-rich part of the sorghum genome was selected based on whole-genome profiling, sequenced, and assembled into a 382-Mb single tiling path of high-quality sequence. A total of 25,316 protein-coding gene models are predicted, 17% of which display no colinearity with their sorghum orthologs. We show that the two species involved in modern cultivars, S. officinarum and S. spontaneum, differ by their transposable elements and by a few large chromosomal rearrangements, explaining their distinct genome sizes and distinct basic chromosome numbers while also suggesting that polyploidization arose in both lineages after their divergence.
The germ plasm is a specialized region of oocyte cytoplasm that contains determinants of germ cell fate. In Xenopus oocytes, the germ plasm is part of the METRO region of the mitochondrial cloud. It contains the germinal granules and a variety of coding and noncoding RNAs that include Xcat2, Xlsirts, Xdazl, DEADSouth, Xpat, Xwnt11, fatVg, B7/Fingers, C10/XFACS, and mitochondrial large and small rRNA. We analyzed the distribution of these 11 different RNAs within the various compartments of the germ plasm during Xenopus oogenesis and development by using whole-mount electron microscopy in situ hybridization. Serial EM sections were used to reconstruct a three-dimensional image of germinal granule distribution within the METRO region of the cloud and the distribution of RNAs on the granules in oocytes and embryos. We found that, in oocytes, the majority of RNAs were associated either with the precursor of germinal granules or with the germ plasm matrix. Only Xcat2, Xpat, and DEADSouth RNAs were associated with the mature germinal granules in oocytes, while only Xcat2 and Xpat were associated with germinal granules in embryos. However, Xcat2 was the only RNA that was consistently sequestered inside the germinal granules, while the others were located on the periphery. Xdazl, which functions in germ cell migration/formation, was detected on the matrix between granules. Later in development, Xcat2 mRNA was released from the germinal granules. This coincides with the timing of its translational derepression. These results demonstrate that the germinal granules have a dynamic three-dimensional architecture that changes during oogenesis and development. They also indicate that association of specific RNAs with the germinal granules is not a prerequisite for their serving a germ cell function; however, it may be related to their state of translational repression.
Worldwide genetic diversity in 200 individuals comprising 41 castor bean accessions was assessed using amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs). We found that, despite surveying five continents and 35 countries, genetic diversity in castor bean germplasm is relatively low (overall H_e = 0.126 for AFLPs and 0.188 for SSRs) compared to estimates of genetic diversity in other plant species. Our data also show no geographic structuring of genotypes across continents or countries within continents. An assessment of the congruence between AFLPs and SSRs indicates a low correlation (R² = 0.19) between the two data sets, but each marker class nonetheless shows similar patterns of low genetic diversity and a lack of geographic structure. Our data do suggest that SSRs yield a higher percentage of polymorphic loci, higher heterozygosity, and a greater range of genetic distances, and are therefore more informative than AFLPs on a locus-by-locus basis. Based on comparisons with numerous other plant species, we suggest that the lower genetic variation in this worldwide collection may be due to one or more factors, including: sampling strategies that have not captured the full extent of genetic variation in the species; artifactual variation due to long-term germplasm storage and seed regeneration; or intense selection followed by domestic cultivation of a limited number of castor bean genotypes, which are widely propagated for their horticultural and agro-economic value.