Sequence alignment algorithms and database search methods use the BLOSUM and PAM substitution matrices, which were constructed from general proteins. These de facto standard matrices are not optimal for accurately aligning proteins with a markedly different amino acid compositional bias. In this work, a new amino acid substitution matrix (Hubsm) is calculated for the disordered and low-complexity-rich regions of hub proteins, based on residue characteristics. The amino acid background frequencies and substitution scores obtained from Hubsm reveal residue substitution patterns that differ from those of commonly used scoring matrices. When hub protein sequences are compared to detect homologs, the Hubsm matrix yields better results than the PAM and BLOSUM matrices. The Hubsm matrix may therefore be a better choice for database searches and for constructing more accurate sequence alignments of hub proteins.

Many studies have confirmed that the disordered regions of hub proteins play a key role in interacting with multiple partners and are involved in cell signalling pathways [7], [8], [9]. Computational studies suggest that disordered regions occur significantly more often in eukaryotic proteomes than in prokaryotic proteomes [10], [11], [12]. This prevalence arises because the more complex signalling and regulatory pathways of eukaryotes rely heavily on disordered proteins. The disordered regions of hub proteins exhibit low-complexity amino acid compositions [13], [14] and internal repeats [15]. Zsuzsanna et al. studied protein disorder and regions of low complexity in the interaction networks of eukaryotic proteomes such as D. melanogaster, C. elegans, S. cerevisiae and H. sapiens. The study suggests that hub proteins tend to be larger and to exhibit more frequent disordered and low-complexity regions, which serve as a structural basis for the manifold interactions of hub proteins [16].

Research also shows that hub proteins, which have more protein-protein interactions, evolve at a much slower rate than other proteins. Kim et al. revealed that the mutation rate of a hub is influenced not only by the number of its interacting partners but also by the amount of protein surface involved in interactions with other proteins [17]. Previous studies have shown that the hub proteome is strongly biased towards certain amino acids. A large part of this bias is accounted for by frequent, peculiar low-complexity sequences, characterized by the redundant usage of a few amino acids. These amino acids also evolve at a different mutation rate [17]. Disordered regions lack hydrophobic amino acids and contain more hydrophilic and charged residues [18], and low-complexity regions are enriched in cysteine and glutamine [19].

The importance of hub proteins in interaction networks and signalling pathways has motivated the development of many computational tools and structural studies. These approaches require similarity search and sequence alignment as the...