Abstract-In this paper, we present a network approach based on the recently developed 3D-BLAST method for rapid protein structure search. We define new local segments that represent structural features of proteins, named units of structural alphabet (USA). Each USA is composed of two protein secondary structures and the loop located between them. We performed an all-against-all structural comparison of USAs and constructed the USA-based similarity network. The analysis shows that the network has a power-law degree distribution, that is, it is scale free. These results not only suggest the existence of organizing principles in local protein structure but also allow us to identify potential key fragments that could be useful for future drug design and development.

Index Terms-Local structure similarity network, network biology, protein modularity.
Manuscript received January 15, 2013; revised March 15, 2013. This work was supported in part by the National Science Council (NSC), Taiwan, under Contracts NSC 101-2311-B-216-001 and 101-2221-E-216-041.
Chi-Hua Tung is with the Department of Bioinformatics, Chung-Hua University, Hsinchu 300, Taiwan (e-mail: chihua.tung@chu.edu.tw).
Jose C. Nacher is with the Department of Information Science, Faculty of Science, Toho University, Miyama 2-2-1, Funabashi, Chiba 274-8510, Japan (e-mail: nacher@is.sci.toho-u.ac.jp).

I. INTRODUCTION

In the past few decades, genomics (DNA sequences), structural genomics (protein structures), and proteomics (protein expression and interactions) have rapidly advanced our knowledge of biological functions and systems. With structural models developed using genome-wide investigative strategies [1]-[3], the number of protein structures in the Protein Data Bank (PDB) has increased rapidly. As of Dec. 25, 2012, there were already more than 87,090 known protein structures [4]. The growing number of known protein structures with unknown or unassigned functions underscores the demand for effective bioinformatics methods that annotate the structural homology or evolutionary family of proteins and infer their cellular functions.

Comparing new protein structures of unclear function with well-characterized structures is one way to bridge the protein structure-function research gap. Given a query protein structure, one can search a database and report similar protein structures. However, unlike one-dimensional sequence comparison, structural alignment for determining similarities is much more complex and computationally expensive. Some methods perform pairwise structural comparison efficiently [5], but they entail an exhaustive search that compares the query structure against every protein structure in the database.

To bridge the protein structure-function research gap and address the questions above, many approaches have been proposed for encoding 3D local structural fragments, based on Cartesian coordinates, into a one-dimensional representation using several letters called the str...