Essential proteins are indispensable for the survival or reproduction of an organism. Identifying essential proteins is necessary not only for understanding the minimal requirements for cellular life but also for disease research and drug design. With the development of high-throughput techniques, a large amount of protein-protein interaction data has become available, which promotes the study of essential proteins at the network level. Although a series of computational methods have been proposed, their prediction precision still needs to be improved. In this paper, we propose a new method, United complex Centrality (UC), to identify essential proteins by integrating protein complexes with the topological features of protein-protein interaction (PPI) networks. By analyzing the relationship between essential proteins and the known protein complexes of S. cerevisiae and human, we find that proteins in complexes are more likely to be essential than proteins not included in any complex, and that proteins appearing in multiple complexes are more likely to be essential than those appearing in only a single complex. Considering that some protein complexes generated by computational methods are inaccurate, we also provide a modified version of UC with a parameter alpha, named UC-P. The experimental results show that protein complex information helps identify essential proteins more accurately in both the S. cerevisiae and the human PPI networks. The proposed method UC performs clearly better than eight previously proposed methods (DC, IC, EC, SC, BC, CC, NC, and LAC) for identifying essential proteins.
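The relationship described in this abstract, between how many complexes a protein belongs to and how likely it is to be essential, can be illustrated with a minimal Python sketch. This is not the authors' UC score, whose formula is not given here. The function name `essential_fraction_by_membership`, the three group labels, and the toy protein sets are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def essential_fraction_by_membership(complexes, essential, all_proteins):
    """Group proteins by the number of complexes they belong to
    (none, single, multiple) and return the fraction of essential
    proteins in each group. Hypothetical helper, not the UC score."""
    # Count, for each protein, how many complexes contain it.
    membership = Counter(p for members in complexes for p in set(members))
    groups = defaultdict(list)
    for p in all_proteins:
        n = membership.get(p, 0)
        key = "none" if n == 0 else "single" if n == 1 else "multiple"
        groups[key].append(p)
    # Fraction of essential proteins within each non-empty group.
    return {k: sum(p in essential for p in v) / len(v) for k, v in groups.items()}

# Toy, made-up data (assumption): not taken from any real data set.
complexes = [{"A", "B", "C"}, {"A", "D"}, {"A", "B"}]
essential = {"A", "B", "C", "E"}
all_proteins = ["A", "B", "C", "D", "E", "F", "G", "H"]

fractions = essential_fraction_by_membership(complexes, essential, all_proteins)
for group in ("none", "single", "multiple"):
    print(group, fractions[group])
```

The toy sets were built by hand to mirror the trend stated in the abstract, so the printed fractions say nothing about real S. cerevisiae or human data.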
Essential proteins are those necessary for the survival or reproduction of a species. Discovering them is fundamental to understanding the minimal requirements for cellular life and is also valuable for disease research and drug design. With the development of high-throughput techniques, a large number of Protein-Protein Interactions (PPIs) can be used to identify essential proteins at the network level. Although a series of network-based computational methods have been proposed, improving prediction precision remains a challenge because of the high rate of false positives in PPI networks. In this paper, we propose a new method, GOS, to identify essential proteins by integrating Gene expression, Orthology, and Subcellular localization information. The gene expression and subcellular localization information is used to determine whether a neighbor in the PPI network is reliable. Only reliable neighbors are considered when we analyze the topological characteristics of a protein in a PPI network. We also analyze the orthologous attributes of each protein to reflect its conservation, and use a random walk model to integrate a protein's topological characteristics with its orthology. The experimental results on the yeast PPI network show that the proposed method GOS outperforms the ten existing methods DC,
Abstract. Essential proteins are indispensable for maintaining cellular life. Identifying essential proteins can provide a basis for drug target design, disease treatment, and minimal genome design in synthetic biology. However, identifying essential proteins with experimental approaches is still time-consuming and expensive. With the development of high-throughput experimental techniques in the post-genome era, a large amount of PPI data and gene expression data can be obtained, which provides an unprecedented opportunity to study essential proteins at the network level. So far, many network topological methods have been proposed to identify essential proteins. In this paper, we propose a new method, United complex Centrality (UC), to identify essential proteins by integrating protein complex information with the topological features of the PPI network. By analyzing the relationship between protein complexes and essential proteins, we find that proteins appearing in multiple complexes are more likely to be essential than those appearing in only a single complex. The experimental results show that protein complex information helps identify essential proteins more accurately. Our method UC is clearly better than traditional centrality methods (DC, IC, EC, SC, BC, CC, NC) for identifying essential proteins. In addition, it retains a substantial advantage even over Harmonic Centricity, which also uses protein complex information.
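To make the comparison with centrality baselines concrete, the sketch below computes degree centrality (assuming that is what DC abbreviates here) on a tiny undirected PPI network and counts how many known essential proteins fall among the top-ranked candidates. Scoring methods by top-k hits is an assumption about the evaluation; the abstract does not state the protocol. The function names, edge list, and essential set are hypothetical toy values.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return the number of distinct interaction partners per protein
    in an undirected PPI network given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {p: len(nb) for p, nb in neighbors.items()}

def top_k_hits(scores, essential, k):
    """Count known essential proteins among the k highest-scoring ones.
    Ties are broken alphabetically so the ranking is deterministic."""
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return sum(p in essential for p in ranked[:k])

# Toy, made-up network and labels (assumption): for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A", "B", "E"}

dc = degree_centrality(edges)
hits = top_k_hits(dc, essential, k=2)
print(dc, hits)
```

Any other scoring method that returns a protein-to-score dictionary could be passed to `top_k_hits` in the same way, which is what makes a side-by-side comparison of methods straightforward.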