It is likely that many small ORFs (sORFs; 30-100 amino acids) are missed when genomes are annotated. To overcome this limitation, we identified ∼8,000 sORFs with high coding potential in intergenic regions of the Arabidopsis thaliana genome. However, the question remains as to whether these coding sORFs play functional roles. Using a designed array, we generated an expression atlas for 16 organs and 17 environmental conditions among 7,901 identified coding sORFs. A total of 2,099 coding sORFs were highly expressed under at least one experimental condition, and 571 were significantly conserved in other land plants. A total of 473 coding sORFs were overexpressed; ∼10% (49/473) induced visible phenotypic effects, a proportion that is approximately seven times higher than that of randomly chosen known genes. These results indicate that many coding sORFs hidden in plant genomes are associated with morphogenesis. We believe that the expression atlas will contribute to further study of the roles of sORFs in plants.

transcriptome | phenome | Agilent custom microarray | transgenic plant | peptide hormone

It has been revealed that small ORFs (sORFs; 30-100 amino acids) are translated into peptides that play essential roles in eukaryotes. For example, in yeast, 21 of 247 peptides encoded by sORFs are essential for viability, as identified by KO analyses (1). In Drosophila, several peptides encoded by sORFs are involved in activating transcription factors related to development (2). In plants, a number of peptides encoded by known small genes (<150 codons) play significant roles in various aspects of plant growth and development. Specific receptors for various peptides have been identified as receptor kinases (3-18). Although peptides translated from sORFs have important roles, a high rate of false-positive prediction affects the identification of coding sORFs in genome sequences (19, 20).
Therefore, in a representative plant species, Arabidopsis thaliana, many small genes had been manually identified using a restricted Markov model and similarity searching (21). To further explore the field of small genes, we developed a computational method to identify coding sORFs using the hexamer composition bias between coding sequences (CDSs) and noncoding sequences (NCDSs) (22, 23). Among available gene finders, this program package has the best performance for identifying true small genes (24). The model plant species A. thaliana has a high-quality genome, and more than 7,000 coding sORFs were identified in the intergenic regions that lacked annotated genes (22). The coding sORFs do not have any sequence similarities to annotated genes. In the present study, to examine the functional roles of these newly identified coding sORFs, we designed an array to generate an expression atlas under 16 developmental stages and 17 environmental conditions, with three replicates. Then, we looked for evidence of expression of coding sORFs. We also examined the signatures of selective constraints on the CDSs among the coding sORFs in 16 land plant sp...
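The idea of scoring coding potential from hexamer composition bias can be illustrated with a minimal sketch. This is not the program package cited above (22, 23); it is a simplified log-odds scorer written under stated assumptions: the function names `hexamer_frequencies` and `coding_score` are hypothetical, coding hexamers are counted in frame (step of 3) and noncoding hexamers at every position (step of 1), and a pseudocount of 1 is added so that unseen hexamers do not produce a division by zero.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"
ALL_HEXAMERS = ["".join(p) for p in product(BASES, repeat=6)]


def hexamer_frequencies(seqs, step, pseudocount=1.0):
    """Estimate smoothed hexamer frequencies from training sequences.

    step=3 counts in-frame hexamers (assumed for CDSs);
    step=1 counts every position (assumed for noncoding sequences).
    """
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 5, step):
            hexamer = seq[i:i + 6]
            # Skip hexamers containing ambiguous bases such as N.
            if set(hexamer) <= set(BASES):
                counts[hexamer] += 1
    total = sum(counts.values()) + pseudocount * len(ALL_HEXAMERS)
    return {h: (counts[h] + pseudocount) / total for h in ALL_HEXAMERS}


def coding_score(seq, coding_freq, noncoding_freq):
    """Mean log-odds (coding vs. noncoding) over in-frame hexamers.

    Positive values mean the hexamer composition resembles the coding
    training set more than the noncoding one.
    """
    seq = seq.upper()
    log_ratios = []
    for i in range(0, len(seq) - 5, 3):
        hexamer = seq[i:i + 6]
        if hexamer in coding_freq:
            log_ratios.append(
                math.log(coding_freq[hexamer] / noncoding_freq[hexamer])
            )
    return sum(log_ratios) / len(log_ratios) if log_ratios else 0.0
```

In practice the two frequency tables would be trained on annotated CDSs and on noncoding sequences from the same genome, and a candidate sORF would be called coding only if its score exceeded a threshold calibrated against a background set; the threshold and training sets are left unspecified here because the text above does not give them.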
DNA methylation is an important factor regulating gene expression in organisms. However, whether DNA methylation plays a key role in adaptive evolution is unknown. Here, we show evidence of naturally selected DNA methylation in Arabidopsis thaliana. In comparison with single nucleotide polymorphisms, three types of methylation (methylated CGs [mCGs], mCHGs, and mCHHs) contributed highly to variable gene expression levels among an A. thaliana population. These variably expressed genes account for much of the variation in the quantities of specialized metabolites. Among the three types of methylation, only mCGs located in promoter regions of genes associated with specialized metabolites show a selective sweep signature in the A. thaliana population. Thus, naturally selected mCGs appear to be key mutations that cause the expressional diversity associated with specialized metabolites during plant evolution.