We present a method for predicting protein folding class based on global protein chain description and a voting process. Selection of the best descriptors was achieved by a computer-simulated neural network trained on a data base consisting of 83 folding classes. Protein-chain descriptors include overall composition, transition, and distribution of amino acid attributes, such as relative hydrophobicity, predicted secondary structure, and predicted solvent exposure. Cross-validation testing was performed on 15 of the largest classes. The test shows that proteins were assigned to the correct class (correct positive prediction) with an average accuracy of 71.7%, whereas the inverse prediction of proteins as not belonging to a particular class (correct negative prediction) was 90-95% accurate. When tested on 254 structures used in this study, the top two predictions contained the correct class in 91% of the cases.

Examination of three-dimensional (3D) structures of proteins determined by x-ray diffraction and NMR has shown that the variety of folding patterns of proteins is significantly restricted (1, 2). Since protein sequence information grows significantly faster than information on protein 3D structure, the need for predicting the folding pattern of a given protein sequence naturally arises. Since the first relatively full classification of folding patterns of globular proteins (3), researchers have developed various schemes for classification of protein 3D structures (4-6) that are essentially based on the same spatial motifs.

If the prediction is restricted to a small number of structural classes (less than five), a prediction performance >70% can be easily achieved by using various methods based on a simple representation of sequences as vectors of a small number of general parameters. 
In the simplest classification, proteins are usually described in terms of the following "tertiary super classes": all α (proteins have only α-helix secondary structure), all β (mainly β-sheet secondary structure), α+β (α-helix and β-strand secondary structure segments that do not mix), α/β (mixed or alternating segments of α-helical and β-strand secondary structure), and irregular (7-9). Several statistical methods were developed to predict whether a protein belongs to one of these classes (10-17). In a recent study on predicting protein structural class (all α, all β, or composed of α and β elements) from amino acid composition and hydrophobic pattern frequency information using computer-simulated neural networks (NNs) and statistical clustering, Metfessel et al. (18) obtained a prediction accuracy of 80.2%. Consideration of specific features of folding classes in the form of so-called hidden Markov models or probabilistic grammars allows a >2-fold increase in the number of classes of recognition (9). This method accurately predicts 12 classes; however, the study gives test results only for 16 sequences.

It is obvious that difficulty of folding pattern prediction grows rapidly with the number...
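
The composition, transition, and distribution descriptors named in the abstract above can be illustrated with a minimal Python sketch for a single attribute, relative hydrophobicity. The three-group residue assignment, the function names, and the choice of quantile fractions are assumptions made for illustration; the excerpt does not specify them, and this is not presented as the authors' implementation.

```python
import math
from itertools import combinations

# Hypothetical three-group hydrophobicity assignment (an assumption, not from the excerpt).
GROUPS = {
    "P": "RKEDQN",   # polar
    "N": "GASTPHY",  # neutral
    "H": "CLVIMFW",  # hydrophobic
}
RESIDUE_TO_GROUP = {aa: g for g, aas in GROUPS.items() for aa in aas}


def encode(sequence):
    """Map each residue to its group label; raises KeyError on unknown residues."""
    return [RESIDUE_TO_GROUP[aa] for aa in sequence.upper()]


def composition(labels):
    """Fraction of residues that fall in each group."""
    n = len(labels)
    return {g: labels.count(g) / n for g in GROUPS}


def transition(labels):
    """Fraction of adjacent residue pairs that switch between two given groups."""
    pairs = list(zip(labels, labels[1:]))
    total = len(pairs)
    result = {}
    for a, b in combinations(GROUPS, 2):
        changes = sum(1 for x, y in pairs if {x, y} == {a, b})
        result[a + b] = changes / total if total else 0.0
    return result


def distribution(labels, fractions=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Relative chain position at which the first, 25%, 50%, 75%, and last
    residue of each group occurs (0.0 everywhere if the group is absent)."""
    n = len(labels)
    result = {}
    for g in GROUPS:
        positions = [i + 1 for i, lab in enumerate(labels) if lab == g]
        if not positions:
            result[g] = [0.0] * len(fractions)
            continue
        values = []
        for f in fractions:
            k = max(1, math.ceil(f * len(positions)))
            values.append(positions[k - 1] / n)
        result[g] = values
    return result


def descriptor_vector(sequence):
    """Concatenate the three descriptor types into one fixed-length vector."""
    labels = encode(sequence)
    comp = composition(labels)
    trans = transition(labels)
    dist = distribution(labels)
    vector = [comp[g] for g in GROUPS]
    vector += [trans[key] for key in sorted(trans)]
    for g in GROUPS:
        vector += dist[g]
    return vector
```

Under these assumptions every chain, whatever its length, maps to a vector of 21 numbers (3 composition, 3 transition, 15 distribution values), which is the kind of fixed-length global description a neural network can take as input.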
The crystal structure of the RNA dodecamer duplex (r-GGACUUCGGUCC)2 has been determined. The dodecamers stack end-to-end in the crystal, simulating infinite A-form helices with only a break in the phosphodiester chain. These infinite helices are held together in the crystal by hydrogen bonding between ribose hydroxyl groups and a variety of donors and acceptors. The four noncomplementary nucleotides in the middle of the sequence did not form an internal loop, but rather a highly regular double-helix incorporating the non-Watson-Crick base pairs, G.U and U.C. This is the first direct observation of a U.C (or T.C) base pair in a crystal structure. The U.C pairs each form only a single base-base hydrogen bond, but are stabilized by a water molecule which bridges between the ring nitrogens and by four waters in the major groove which link the bases and phosphates. The lack of distortion introduced in the double helix by the U.C mismatch may explain its low efficiency of repair in DNA. The G.U wobble pair is also stabilized by a minor-groove water which bridges between the unpaired guanine amino and the ribose hydroxyl of the uracil. This structure emphasizes the importance of specific hydrogen bonding between not only the nucleotide bases, but also the ribose hydroxyls, phosphate oxygens and tightly bound waters in stabilization of the intramolecular and intermolecular structures of double helical RNA.
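
The pairing pattern described above can be checked with a short Python sketch that aligns the dodecamer antiparallel against a copy of itself and labels each pair. The function names and the simple three-way classification (Watson-Crick, G.U wobble, other mismatch) are illustrative assumptions; the sketch assumes a fully aligned duplex with no strand slippage and says nothing about hydrogen bonding or bound waters.

```python
# Canonical and wobble pairs, stored as unordered sets of bases.
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(a, b):
    """Label a base pair by a simplified rule (an assumption for illustration)."""
    pair = frozenset((a, b))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "G.U wobble"
    return "mismatch"


def duplex_pairs(strand):
    """Pair position i of one strand with position n+1-i of an identical,
    antiparallel partner strand. Returns (i, base, j, base, label) tuples."""
    n = len(strand)
    return [
        (i + 1, strand[i], n - i, strand[n - 1 - i],
         classify_pair(strand[i], strand[n - 1 - i]))
        for i in range(n)
    ]


if __name__ == "__main__":
    for i, a, j, b, label in duplex_pairs("GGACUUCGGUCC"):
        print(f"{a}{i:<2} - {b}{j:<2} {label}")
```

For r-GGACUUCGGUCC this gives eight Watson-Crick pairs flanking a central block of two G.U wobble pairs and two U.C pairs, which corresponds to the four noncomplementary nucleotides (UUCG) in the middle of the sequence.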
The x-ray crystal structure of a 417-nt ribonuclease P RNA from Bacillus stearothermophilus was solved to 3.3-Å resolution. This RNA enzyme is constructed from a number of coaxially stacked helical domains joined together by local and long-range interactions. These helical domains are arranged to form a remarkably flat surface, which is implicated by a wealth of biochemical data in the binding and cleavage of the precursors of transfer RNA substrate. Previous photoaffinity crosslinking data are used to position the substrate on the crystal structure and to identify the chemically active site of the ribozyme. This site is located in a highly conserved core structure formed by intricately interlaced long-range interactions between interhelical sequences.

ribozyme | RNA crystallography | tRNA processing

RNase P catalyzes hydrolysis of a phosphodiester bond in precursors of transfer RNA (tRNA) to form the 5′-phosphorylated mature tRNA with the release of a 5′-precursor fragment (1, 2). RNase P homologs occur in all organisms, and the cellular RNase P always is a ribonucleoprotein that consists of one large RNA and one or more protein components. In bacteria, RNase P is typically comprised of a 350- to 400-nt RNA and one ≈120-aa basic protein. Although both RNA and protein components are necessary for cell viability, in vitro at high salt concentrations, bacterial RNase P RNA can act as a catalyst independently of protein (3). Bacterial RNase P is a ribozyme, an RNA-based enzyme.

Knowledge of the structure of RNase P RNA is essential for understanding its function, and structure has been the focus of numerous studies of the RNA. Phylogenetic comparative analyses of RNase P RNA sequences have established the secondary and some tertiary structure of the RNA in a broad diversity of organisms (4-8). Photochemical crosslinking studies provided structural information to orient the helical elements and identified nucleotides associated with the active site of the RNA (9, 10). 
There are two major structural types of bacterial RNase P RNA, A (ancestral) and B (Bacillus), which differ in a number of structural elements attached to a homologous conserved structure. About two-thirds of any bacterial RNase P RNA is shown by sequence covariations to be involved in Watson-Crick base-pairing interactions, but the interactions that form the global structure have been speculative.

To gain a better understanding of bacterial RNase P, we crystallized and solved the structure of a 417-nt B-type RNase P RNA from Bacillus stearothermophilus, a moderately thermophilic, low G+C Gram-positive bacterium. Although the structure does not yet explain the chemical mechanism of catalysis, it is in agreement with a wealth of available biochemical and comparative data, and it provides a structural context for the chemically active site of this ribozyme.

Materials and Methods

RNA Purification, Crystallization, and Data Collection. As detailed in supporting information, which is published on the PNAS web site, RNA was transcribed in vitro with T7 phage RNA ...