Delaunay tessellation is applied for the first time in the analysis of protein structure. By representing amino acid residues in protein chains by Cα atoms, the protein is described as a set of points in three-dimensional space. Delaunay tessellation of a protein structure generates an aggregate of space-filling irregular tetrahedra, or Delaunay simplices. The vertices of each simplex objectively define four nearest-neighbor Cα atoms, i.e., four nearest-neighbor residues. A simplex classification scheme is introduced in which simplices are divided into five classes based on the relative positions of vertex residues in the protein primary sequence. Statistical analysis of the residue composition of Delaunay simplices reveals nonrandom preferences for certain quadruplets of amino acids to be clustered together. This nonrandom preference may be used to develop a four-body potential for evaluating sequence-structure compatibility for the purpose of inverted structure prediction.
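The abstract above does not spell out the five simplex classes. As a minimal sketch, assume they correspond to the five ways four vertex residues can group into runs of sequence-consecutive positions: 4, 3+1, 2+2, 2+1+1, and 1+1+1+1. The function name `contiguity_class` and the tuple encoding of each class are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
from typing import Sequence, Tuple


def contiguity_class(residue_indices: Sequence[int]) -> Tuple[int, ...]:
    """Classify a simplex by how its four vertex residues cluster in sequence.

    Returns the lengths of the runs of consecutive sequence positions,
    sorted in descending order. Under the assumption stated above, the
    five possible results are (4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1).
    """
    if len(residue_indices) != 4 or len(set(residue_indices)) != 4:
        raise ValueError("a simplex needs four distinct residue indices")

    ordered = sorted(residue_indices)
    runs = [1]  # length of each run of consecutive positions
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous == 1:
            runs[-1] += 1   # extends the current run
        else:
            runs.append(1)  # a sequence gap starts a new run
    return tuple(sorted(runs, reverse=True))


if __name__ == "__main__":
    # Hypothetical vertex residue numbers, one simplex per line.
    for simplex in [(10, 11, 12, 13), (10, 11, 12, 40), (10, 11, 40, 41),
                    (10, 11, 40, 70), (10, 30, 50, 70)]:
        print(simplex, contiguity_class(simplex))
```

In practice the residue indices would come from the vertices of each tetrahedron produced by a Delaunay tessellation of the Cα coordinates; that step is left out here to keep the sketch dependency-free.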
CD147, also known as extracellular matrix metalloproteinase inducer, is a regulator of matrix metalloproteinase production and also serves as a signaling receptor for extracellular cyclophilins. Previously, we demonstrated that cell surface expression of CD147 is sensitive to the cyclophilin-binding drug cyclosporin A, suggesting involvement of a cyclophilin in the regulation of intracellular transport of CD147. In this report, we identify this cyclophilin as cyclophilin 60 (Cyp60), a distinct member of the cyclophilin family of proteins. CD147 co-immunoprecipitated with Cyp60, and confocal immunofluorescent microscopy revealed intracellular colocalization of Cyp60 and CD147. This interaction with Cyp60 involved proline 211 of CD147, which was shown previously to be critical for interaction between CD147 and another cyclophilin, cyclophilin A, in solution. Mutation of this proline residue abrogated co-immunoprecipitation of CD147 and Cyp60 and reduced surface expression of CD147 on the plasma membrane. Suppression of Cyp60 expression using RNA interference had an effect similar to that of cyclosporin A: reduction of cell surface expression of CD147. These results suggest that Cyp60 plays an important role in the translocation of CD147 to the cell surface. Therefore, Cyp60 may present a novel target for therapeutic interventions in diseases where CD147 functions as a pathogenic factor, such as cancer, human immunodeficiency virus infection, or rheumatoid arthritis.
Motivation: An important area of research in biochemistry and molecular biology focuses on characterization of enzyme mutants. However, synthesis and analysis of experimental mutants are time-consuming and expensive. We describe a machine-learning approach for inferring the activity levels of all unexplored single point mutants of an enzyme, based on a training set of such mutants with experimentally measured activity. Results: Based on a Delaunay tessellation-derived four-body statistical potential function, a perturbation vector measuring environmental changes relative to wild type (wt) at every residue position uniquely characterizes each enzyme mutant for model development and prediction. First, a measure of model performance based on the area under the receiver operating characteristic (ROC) curve (AUC) surpasses 0.83 and 0.77 for data sets of experimental HIV-1 protease and T4 lysozyme mutants, respectively. Second, a novel method is introduced for evaluating statistical significance associated with the number of correct test set predictions obtained from a trained model. Third, 100 stratified random splits of the protease and T4 lysozyme mutant data sets into training and test sets achieve 77.0% and 80.8% mean accuracy, respectively. Next, protease and T4 lysozyme models trained with experimental mutants are used to predict activity levels for all remaining mutants; a subsequent search for publications reporting on dozens of these test mutants reveals that experimental results are matched by 79% and 86% of predictions, respectively. Finally, learning curves for each mutant enzyme system indicate the influence of training set size on model performance.
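The AUC figures quoted above can be read as the probability that a randomly chosen active mutant receives a higher model score than a randomly chosen inactive one. The sketch below computes that quantity directly by pairwise comparison. It is an illustration of the metric only, not the authors' pipeline; the function name `roc_auc`, the 1/0 encoding of active/inactive, and the example scores are all assumptions.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (assumed here to mean "active"),
            0 for the negative class.
    scores: model outputs, where higher means more likely positive.
    Tied positive/negative pairs count as half a correct ranking.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0
            elif p == n:
                correctly_ranked += 0.5
    return correctly_ranked / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    example_labels = [1, 1, 0, 0]
    example_scores = [0.9, 0.4, 0.6, 0.1]
    print(roc_auc(example_labels, example_scores))
```

The pairwise form is quadratic in the number of mutants, which is fine for data sets of a few hundred entries; a rank-based formulation would be preferable for much larger sets.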