Motivation
Protein complexes are groups of polypeptide chains linked by noncovalent protein-protein interactions (PPIs). They play important roles in biological systems and perform numerous functions, including DNA transcription, mRNA translation, and signal transduction. Over the past decade, a number of computational methods have been developed to identify protein complexes from protein interaction networks (PINs) by mining dense subnetworks or subgraphs.
Results
In this paper, unlike existing work, we propose a novel approach to this task based on generative adversarial networks (GANs), called PCGAN (identifying Protein Complexes by GAN). Using real complexes as training samples, our method learns a model that generates new complexes from a PIN. To support model training and testing, we construct two more comprehensive and reliable PINs and a larger gold standard complex set by merging existing ones from the same organism (human and yeast). Extensive comparison studies indicate that our method outperforms existing protein complex identification methods across various performance metrics. Furthermore, functional enrichment analysis shows that the identified complexes are of high biological significance, suggesting that the generated complexes are very likely to be real.
Availability
https://github.com/yul-pan/PCGAN.
Supplementary information
Supplementary data are available at Bioinformatics online.