Environmental stress leads to dramatic transcriptional reprogramming, which is central to plant survival. Although substantial knowledge has accumulated on how a few plant cis-regulatory elements (CREs) function in stress regulation, many more CREs remain to be discovered. In addition, the plant stress cis-regulatory code, i.e., how CREs work independently and/or in concert to specify stress-responsive transcription, is mostly unknown. On the basis of gene expression patterns under multiple stresses, we identified a large number of putative CREs (pCREs) in Arabidopsis thaliana with characteristics of authentic cis-elements. Surprisingly, biotic and abiotic responses are mostly mediated by two distinct pCRE superfamilies. In addition, we uncovered cis-regulatory codes specifying how pCRE presence and absence, combinatorial relationships, location, and copy number can be used to predict stress-responsive expression. Expression prediction models based on pCRE combinations perform significantly better than those based on simply pCRE presence and absence, location, and copy number. Furthermore, instead of a few master combinatorial rules for each stress condition, many rules were discovered, and each appears to control only a small subset of stress-responsive genes. Given there are very few documented interactions between plant CREs, the combinatorial rules we have uncovered significantly contribute to a better understanding of the cis-regulatory logic underlying plant stress response and provide prioritized targets for experimentation.
Keywords: machine learning | motif discovery | transcription factor binding site
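The feature types named in this abstract (pCRE presence and absence, copy number, and combinatorial relationships) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `motif_features`, the feature naming scheme, the two motif strings, and the toy promoter sequence are all assumptions chosen for illustration, and location features are omitted for brevity.

```python
from itertools import combinations


def motif_features(promoter, motifs):
    """Encode one promoter sequence as simple motif-based features.

    Hypothetical helper: produces presence/absence, copy-number, and
    pairwise co-occurrence features for a list of motif strings.
    """
    seq = promoter.upper()
    features = {}
    for motif in motifs:
        width = len(motif)
        # Count occurrences, allowing overlapping matches.
        copies = sum(
            1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif
        )
        features[f"present:{motif}"] = int(copies > 0)
        features[f"copies:{motif}"] = copies
    # A "combination" feature here is simply co-occurrence of two motifs
    # in the same promoter; real combinatorial rules may be more complex.
    for a, b in combinations(sorted(motifs), 2):
        features[f"pair:{a}+{b}"] = (
            features[f"present:{a}"] & features[f"present:{b}"]
        )
    return features


# Illustrative motifs and a made-up promoter fragment (assumptions).
motifs = ["ACGTG", "GCCGCC"]
example = motif_features("ttACGTGttACGTGaa", motifs)
```

A table of such feature dictionaries, one per gene, could then be passed to any classifier to compare models with and without the `pair:` features, which is the kind of comparison the abstract reports.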
Essential genes represent critical cellular components whose disruption results in lethality. Characteristics shared among essential genes have been uncovered in fungal and metazoan model systems. However, features associated with plant essential genes are largely unknown and the full set of essential genes remains to be discovered in any plant species. Here, we show that essential genes in Arabidopsis thaliana have distinct features useful for constructing within- and cross-species prediction models. Essential genes in A. thaliana are often single copy or derived from older duplications, highly and broadly expressed, slow evolving, and highly connected within molecular networks compared with genes with nonlethal mutant phenotypes. These gene features allowed the application of machine learning methods that predicted known lethal genes as well as an additional 1970 likely essential genes without documented phenotypes. Prediction models from A. thaliana could also be applied to predict Oryza sativa and Saccharomyces cerevisiae essential genes. Importantly, successful predictions drew upon many features, while any single feature was not sufficient. Our findings show that essential genes can be distinguished from genes with nonlethal phenotypes using features that are similar across kingdoms and indicate the possibility for translational application of our approach to species without extensive functional genomic and phenomic resources.
Nucleosome positioning influences the access of transcription factors (TFs) to their binding sites and gene expression. Studies in plant, animal, and fungal models demonstrate similar nucleosome positioning patterns along genes and correlations between occupancy and expression. However, the relationships among nucleosome positioning, cis-regulatory element accessibility, and gene expression in plants remain undefined. Here we showed that plant nucleosome depletion occurs on specific 6-mer motifs and this sequence-specific nucleosome depletion is predictive of expression levels. Nucleosome-depleted regions in Arabidopsis thaliana tend to have higher G/C content, unlike yeast, and are centered on specific G/C-rich 6-mers, suggesting that intrinsic sequence properties, such as G/C content, cannot fully explain plant nucleosome positioning. These 6-mer motif sites showed higher DNase I hypersensitivity and are flanked by strongly phased nucleosomes, consistent with known TF binding sites. Intriguingly, this 6-mer-specific nucleosome depletion pattern occurs not only in promoter but also in genic regions and is significantly correlated with higher gene expression level, a phenomenon also found in rice but not in yeast. Among the 6-mer motifs enriched in genes responsive to treatment with the defense hormone jasmonate, there are no significant changes in nucleosome occupancy, suggesting that these sites are potentially preconditioned to enable rapid response without changing chromatin state significantly. Our study provides a global assessment of the joint contribution of nucleosome occupancy and motif sequences that are likely cis-elements to the control of gene expression in plants. Our findings pave the way for further understanding the impact of chromatin state on plant transcriptional regulatory circuits.
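The two sequence quantities this abstract relies on, 6-mer occurrences and G/C content, can be computed as in the sketch below. It only illustrates the bookkeeping, not the study's analysis: the function names `count_kmers` and `gc_fraction` and the toy sequence are assumptions, and no nucleosome occupancy data are involved.

```python
from collections import Counter


def count_kmers(sequence, k=6):
    """Count every overlapping k-mer in a DNA sequence.

    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def gc_fraction(kmer):
    """Return the fraction of G or C bases in a k-mer."""
    return sum(base in "GC" for base in kmer) / len(kmer)


# Toy sequence (an assumption, not data from the study).
kmer_counts = count_kmers("GCCGCCGCC")
gc_rich = {kmer for kmer in kmer_counts if gc_fraction(kmer) >= 0.5}
```

Counts like these, tabulated separately for nucleosome-depleted and nucleosome-occupied regions, would be the starting point for asking which 6-mers are over-represented in depleted regions.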
ORCID IDs: 0000-0003-0863-0384 (S.U.); 0000-0001-6470-235X (S.-H.S.).
Plants are exposed to a variety of environmental conditions, and their ability to respond to environmental variation depends on the proper regulation of gene expression in an organ-, tissue-, and cell type-specific manner. Although our knowledge of how stress responses are regulated is accumulating, a genome-wide model of how plant transcription factors (TFs) and cis-regulatory elements control spatially specific stress response has yet to emerge. Using Arabidopsis (Arabidopsis thaliana) as a model, we identified a set of 1,894 putative cis-regulatory elements (pCREs) that are associated with high-salinity (salt) up-regulated genes in the root or the shoot. We used these pCREs to develop computational models that can better predict salt up-regulated genes in the root and shoot compared with models based on known TF binding motifs. In addition, we incorporated TF binding sites identified via large-scale in vitro assays, chromatin accessibility, evolutionary conservation, and pCRE combinatorial relationships in machine learning models and found that only consideration of pCRE combinations led to better performance in salt up-regulation prediction in the root and shoot. Our results suggest that the plant organ transcriptional response to high salinity is regulated by a core set of pCREs and provide a genome-wide view of the cis-regulatory code of plant spatial transcriptional responses to environmental stress.