The CAPRI and CASP prediction experiments have demonstrated the power of community-wide tests of methodology in assessing the current state of the art and spurring progress in the very challenging areas of protein docking and structure prediction. We sought to bring the power of community-wide experiments to bear on a very challenging protein design problem that provides a complementary but equally fundamental test of current understanding of protein-binding thermodynamics. We have generated a number of designed protein-protein interfaces that have very favorable computed binding energies but do not appear to form in experiments, suggesting that important physical chemistry may be missing from the energy calculations. Twenty-eight research groups took up the challenge of determining what is missing: we provided structures of 87 designed complexes and 120 naturally occurring complexes and asked participants to identify energetic contributions and/or structural features that distinguish between the two sets. The community found that electrostatics and solvation terms partially distinguish the designs from the natural complexes, largely due to the non-polar character of the designed interactions. Beyond this polarity difference, the community found that the designed binding surfaces were on average structurally less embedded in the designed monomers, suggesting that backbone conformational rigidity at the designed surface is important for realization of the designed function. These results can be used to improve computational design strategies, but there is still much to be learned; for example, one designed complex, which does form in experiments, was classified by all metrics as a non-binder.
The elucidation of protein-protein interaction (PPI) networks is important for understanding cellular structure and function and structure-based drug design. However, the development of an effective method to conduct exhaustive PPI screening represents a computational challenge. We have been investigating a protein docking approach based on shape complementarity and physicochemical properties. We describe here the development of the protein-protein docking software package “MEGADOCK” that samples an extremely large number of protein dockings at high speed. MEGADOCK reduces the calculation time required for docking by using several techniques such as a novel scoring function called the real Pairwise Shape Complementarity (rPSC) score. We showed that MEGADOCK is capable of exhaustive PPI screening by completing docking calculations 7.5 times faster than the conventional docking software, ZDOCK, while maintaining an acceptable level of accuracy. When MEGADOCK was applied to a subset of a general benchmark dataset to predict 120 relevant interacting pairs from 120 x 120 = 14,400 combinations of proteins, an F-measure value of 0.231 was obtained. Further, we showed that MEGADOCK can be applied to a large-scale protein-protein interaction-screening problem with accuracy better than random. When our approach is combined with parallel high-performance computing systems, it is now feasible to search and analyze protein-protein interactions while taking into account three-dimensional structures at the interactome scale. MEGADOCK is freely available at .
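The F-measure quoted in the MEGADOCK abstract above is the harmonic mean of precision and recall over the 120 x 120 = 14,400 candidate pairs. The sketch below shows how such a value is computed; the true-positive and false-positive counts are hypothetical and chosen only for illustration (the abstract does not report them), and `f_measure` is an illustrative helper name, not part of MEGADOCK.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F-measure (harmonic mean of precision and recall)."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Screening setup taken from the abstract.
N_PROTEINS = 120
N_CANDIDATES = N_PROTEINS * N_PROTEINS  # 14,400 receptor-ligand combinations
N_TRUE_PAIRS = 120                      # relevant interacting pairs

# Hypothetical prediction outcome (NOT from the paper).
true_positives = 30
false_positives = 90
false_negatives = N_TRUE_PAIRS - true_positives

score = f_measure(true_positives, false_positives, false_negatives)
print(f"candidates={N_CANDIDATES}, F-measure={score:.3f}")
```

Because only 120 of the 14,400 candidates are true pairs, a predictor that selects pairs at random has an expected precision of 120/14,400, or about 0.008, which is the baseline against which a screening result of this kind is judged.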
Background: Protein-protein interaction (PPI) plays a core role in cellular functions. Massively parallel supercomputing systems have been actively developed over the past few years, enabling large-scale biological problems to be solved, such as PPI network prediction based on tertiary structures.
Results: We have developed a high-throughput and ultra-fast PPI prediction system based on rigid docking, “MEGADOCK”, by employing a hybrid parallelization (MPI/OpenMP) technique intended for use on massively parallel supercomputing systems. MEGADOCK displays significantly faster processing speed in the rigid-body docking process, which allows full utilization of protein tertiary structural data for large-scale and network-level problems in systems biology. Moreover, the system was scalable, as shown by measurements carried out on two supercomputing environments. We then conducted prediction of biological PPI networks using the post-docking analysis.
Conclusions: We present a new protein-protein docking engine aimed at exhaustive docking of mega-order numbers of protein pairs. The system was shown to be scalable by running on thousands of nodes. The software package is available at: http://www.bi.cs.titech.ac.jp/megadock/k/.
Extended proteins such as calmodulin and troponin C have two globular terminal domains linked by a central region that is exposed to water and often acts as a function-regulating element. The mechanisms that stabilize the tertiary structure of extended proteins appear to differ greatly from those of globular proteins. Identifying such differences in physical properties of amino acid sequences between extended proteins and globular proteins can provide clues useful for identification of extended proteins from complete genomes including orphan sequences. In the present study, we examined the structure and amino acid sequence of extended proteins. We found that extended proteins have a large net electric charge, high charge density, and an even balance of charge between the terminal domains, indicating that electrostatic interaction is a dominant factor in stabilization of extended proteins. Additionally, the central domain exposed to water contained many amphiphilic residues. Extended proteins can be identified from these physical properties of the tertiary structure, which can be deduced from the amino acid sequence. Analysis of physical properties of amino acid sequences can provide clues to the mechanism of protein folding. Also, structural changes in extended proteins may be caused by formation of molecular complexes. Long-range effects of electrostatic interactions also appear to play important roles in structural changes of extended proteins.
Keywords: structural classification; extended protein; bioinformatics; structural genomics; mechanism of structural stabilization; physical properties of amino acid residues
Complete genomes include many orphan amino acid sequences, the functions and structures of which are unknown. Determination of the tertiary structure of the proteins corresponding to these sequences is important for elucidation of their function, because the structure of a protein is closely related to its function.
However, some types of proteins with unknown structures, such as nonglobular proteins containing flexible extended segments, are difficult to crystallize. Nonglobular soluble proteins with flexible segments are often involved in regulatory and cell-signaling functions (Wright and Dyson 1999; Ward et al. 2004). For example, calmodulin and troponin C appear to be nonglobular soluble extended proteins. Such extended proteins provide very interesting problems involving structure and changes in structure. First, extended proteins lack some physical properties of globular proteins, and vice versa. The structure of single extended protein molecules, as exemplified by calmodulin and troponin C, consists of separate domains near each terminal linked by a central segment exposed to water (Babu et al. 1988; Houdusse et al. 1997; Chou et al. 2001). In contrast, globular proteins are stabilized by a hydrophobic core (Kauzmann 1959). Second, extended proteins often contain a flexible segment, which allows changes in their structure to occur. For example, the central part of the region linking the te...
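The sequence-level properties named in the extended-protein abstract above (net charge, charge density, and balance of charge between the terminal domains) can be estimated directly from a sequence. The following is a minimal sketch under stated assumptions, not the authors' procedure: Lys and Arg are counted as +1, Asp and Glu as -1, and His is ignored (a rough neutral-pH convention); charge density is taken as the fraction of charged residues; and the two "terminal domains" are approximated by simply splitting the sequence in half. The function names and the toy sequence are hypothetical.

```python
POSITIVE = frozenset("KR")  # assumed +1 at neutral pH
NEGATIVE = frozenset("DE")  # assumed -1 at neutral pH


def net_charge(seq: str) -> int:
    """Net charge: count of K/R minus count of D/E."""
    seq = seq.upper()
    pos = sum(1 for aa in seq if aa in POSITIVE)
    neg = sum(1 for aa in seq if aa in NEGATIVE)
    return pos - neg


def charged_fraction(seq: str) -> float:
    """Fraction of residues that carry a charge (assumed density measure)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    charged = sum(1 for aa in seq if aa in POSITIVE or aa in NEGATIVE)
    return charged / len(seq)


def terminal_charges(seq: str) -> tuple[int, int]:
    """Net charge of the N-terminal and C-terminal halves of the sequence."""
    mid = len(seq) // 2
    return net_charge(seq[:mid]), net_charge(seq[mid:])


# Toy sequence for illustration only; it is not a real protein.
toy = "DDEEKAAADDEK"
print(net_charge(toy), round(charged_fraction(toy), 2), terminal_charges(toy))
```

A real analysis would use domain boundaries from the known structure instead of a midpoint split, and a pH-dependent charge model if histidine or terminal groups matter.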
Analysis of protein-protein interaction networks has revealed the presence of proteins that have multiple interacting ligand proteins, such as hub proteins. For such proteins, multiple ligands would be predicted as interacting partners when predicting all-to-all protein-protein interactions (PPIs). In this work, to obtain a better understanding of PPI mechanisms, we focused on protein interaction surfaces, which differ between protein pairs. We performed rigid-body docking to obtain information on the interfaces of a set of decoy structures, which include many possible interaction surfaces between a given protein pair. We then investigated the specificity of sets of decoy interactions between true binding partners, using alpha-chymotrypsin, actin, and cyclin-dependent kinase 2 as test proteins that have multiple true binding partners. To observe differences in the interaction surfaces of docking decoys, we introduced broad interaction profiles (BIPs), generated by assembling the interaction profiles of decoys for each protein pair. After cluster analysis, the specificity of the BIPs of true binding partners was observed for each receptor. We used two types of BIPs: those based on amino acid sequences (BIP-seqs) and those based on the compositions of interacting amino acid residue pairs (BIP-AAs). The specificity of a BIP was defined as the number of group members including all true binding partners. We found that BIP-AA cases were more specific than BIP-seq cases. These results indicated that the composition of interacting amino acid residue pairs was sufficient for determining the properties of protein interaction surfaces.