The cis-regulatory regions on DNA serve as binding sites for proteins such as transcription factors and RNA polymerase. The combinatorial interaction of these proteins plays a crucial role in transcription initiation, which is an important point of control in the regulation of gene expression. We present here an analysis of the performance of an in silico method for predicting cis-regulatory regions in the plant genomes of Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) on the basis of free energy of DNA melting. For protein-coding genes, we achieve recall and precision of 96% and 42% for Arabidopsis and 97% and 31% for rice, respectively. For noncoding RNA genes, the program gives recall and precision of 94% and 75% for Arabidopsis and 95% and 90% for rice, respectively. Moreover, 96% of the false-positive predictions were located in noncoding regions of primary transcripts, out of which 20% were found in the first intron alone, indicating possible regulatory roles. The predictions for orthologous genes from the two genomes showed a good correlation with respect to prediction scores and promoter organization. Comparison of our results with an existing program for promoter prediction in plant genomes indicates that our method shows improved prediction capability.
Sequencing and annotation of a large number of eukaryotic genomes has made available an enormous amount of information regarding genetic coding sequences (CDS). These data can be effectively utilized for studying and modifying the expression of genes if the location and contribution of cis-regulatory regions, which control spatial and temporal regulation of gene expression, are available. However, the precise annotation of regulatory regions is difficult as compared with the identification of genes, primarily because regulatory regions do not code for an identifiable product.
In fact, regulatory regions are bound by proteins such as transcription factors, which bring about transcription and its regulation. Determining transcription factor-binding sites (TFBSs) from chromatin immunoprecipitation methods has limitations and requires a lot of downstream data processing (Farnham, 2009). Moreover, the mere binding of a transcription factor at a particular site does not warrant its involvement in the regulation of a gene. Development of computational approaches that enable accurate prediction of cis-regulatory sites could thus greatly aid in deciphering the regulatory mechanisms at the genome level.
The preponderance of noncoding DNA in the eukaryotic genome makes it difficult to identify promoter regions. Most efforts toward the prediction of regulatory regions have traditionally focused on the detection of consensus sequences for the TATA box, Initiator elements, TFBSs, etc. Such sequence-based prediction of short motifs might be inadequate because a large number of false hits are possible by chance. Moreover, there is increasing evidence to suggest that consensus sequences vary greatly and are even absent in many cases. The TATA box, which is consi...