Predicting the promoter activity of a DNA fragment is an important task in computational biology. Approaches that use physical properties of DNA to predict bacterial promoters have recently gained a lot of attention. To select an adequate set of physical properties for training a classifier, various characteristics of the DNA molecule should be taken into consideration. Here, we present a systematic approach that allows us to select less correlated properties for classification by means of both correlation and cophenetic coefficients as well as concordance matrices. As a proof of concept, we have developed the first classifier that uses not only the sequence and static physical properties of a DNA fragment, but also the dynamic properties of DNA open states. As a result, the best performing models achieved accuracy values of up to 90% for all types of sequences. Furthermore, we have demonstrated that the classifier can serve as a reliable tool for distinguishing promoter DNA fragments from promoter islands despite the similarity of their nucleotide sequences.
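The idea of selecting less correlated properties with correlation and cophenetic coefficients can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `select_less_correlated`, the `max_abs_corr` threshold, the choice of average linkage, and the synthetic property matrix are all assumptions made for illustration, and the concordance-matrix step mentioned in the abstract is not covered.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import squareform


def select_less_correlated(X, names, max_abs_corr=0.8):
    """Keep one property per cluster of strongly correlated properties.

    X     : array of shape (n_fragments, n_properties), hypothetical input
    names : list of property names, one per column of X
    Returns the kept names and the cophenetic correlation coefficient.
    """
    corr = np.corrcoef(X, rowvar=False)       # Pearson r between properties
    dist = 1.0 - np.abs(corr)                 # high |r| means small distance
    dist = np.clip((dist + dist.T) / 2.0, 0.0, None)  # remove rounding noise
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)

    tree = linkage(condensed, method="average")
    # Cophenetic coefficient: how faithfully the dendrogram preserves
    # the original pairwise distances (values near 1 are better).
    coph_coef, _ = cophenet(tree, condensed)

    # Properties closer than 1 - max_abs_corr end up in the same cluster.
    labels = fcluster(tree, t=1.0 - max_abs_corr, criterion="distance")

    # Assumption: the first property of each cluster is its representative.
    kept = sorted(int(np.flatnonzero(labels == c)[0]) for c in np.unique(labels))
    return [names[i] for i in kept], float(coph_coef)


# Synthetic data: p2 nearly duplicates p1, p4 nearly mirrors p3.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([
    a,
    a + 0.01 * rng.normal(size=200),
    b,
    -b + 0.01 * rng.normal(size=200),
])
selected, coph = select_less_correlated(X, ["p1", "p2", "p3", "p4"])
print(selected, round(coph, 3))
```

Using the absolute correlation treats strongly anti-correlated properties as redundant too, which is one reasonable design choice among several; a low cophenetic coefficient would suggest trying a different linkage method before trusting the clusters.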
In luminous bacteria, the NAD(P)H:flavin-oxidoreductases LuxG and Fre are homologous enzymes that could provide luciferase with reduced flavin. Although Fre functions as a housekeeping enzyme, LuxG appears to be a source of reduced flavin for bioluminescence, as it is transcribed together with luciferase. This study aims to provide a basic conception of Fre and LuxG evolution and to reveal the peculiarities of the active site structure resulting from functional variation within the oxidoreductase family. A phylogenetic analysis has demonstrated that the Fre and LuxG oxidoreductases evolved separately after the gene duplication event and, consequently, acquired changes in the conservation of functionally related sites. Namely, different evolutionary rates have been observed at the site responsible for specificity to the flavin substrate (Arg 46). Also, Tyr 72, which forms part of a mobile loop involved in FAD binding, has been found to be conserved among Fre but not among LuxG oxidoreductases. The conservation of different amino acid types in the NAD(P)H binding site has been defined for Fre (arginine) and LuxG (proline) oxidoreductases.
The functioning of DNA regulatory regions relies primarily on their physicochemical and structural properties rather than on nucleotide sequences, i.e. the 'genetic text'. The former are responsible for encoding the DNA-protein interactions that govern various regulatory events. One such characteristic is SIDD (Stress-Induced Duplex Destabilization), which quantifies the propensity of a DNA duplex region to melt under imposed superhelical stress. This duplex property has been shown to participate in the activity of various regulatory regions. Here we employ the SIDD model to calculate melting probability profiles for T7 bacteriophage promoter sequences. The genome is characterized by its small size (approximately 40 thousand nucleotides) and by the temporal organization of its expression: at the first stage of infection the early T7 DNA region is transcribed by the host cell RNA polymerase, and later in the life cycle the phage-specific RNA polymerase transcribes the class II and class III gene regions. Differential recognition of a particular group of promoters by the enzyme cannot be explained solely by their nucleotide sequences because, among other reasons, the sequences are fairly similar among most of the promoters. At the same time, the SIDD profiles obtained vary significantly and are clearly separated into groups corresponding to the functional promoter classes of T7 DNA. For example, early promoters are affected by the same maximally destabilized DNA duplex region, located in a varying region of each particular promoter. Class II promoters lack substantially destabilized regions close to transcription start sites. Class III promoters, in contrast, demonstrate characteristic melting probability maxima located in the near-downstream region in all cases. Thus, apparent differences were established among promoter groups with exceptional textual similarity (class II and class III differ by only a few single substitutions).
This confirms the major impact of DNA primary structure on the duplex parameter as well as the need to consider a broad genetic context. The differences in melting probability profiles obtained using the SIDD model, alongside other DNA physicochemical properties, appear to be involved in differential promoter recognition by RNA polymerases.