This is an accepted version of a paper published in Biochimica et Biophysica Acta - Proteins and Proteomics. This paper has been peer-reviewed but does not include the final publisher proof-corrections or journal pagination.

Citation for the published paper: Light, S., Sagit, R., Ekman, D., Elofsson, A. (2013) "Long indels are disordered: A study of disorder and indels in homologous eukaryotic proteins" Biochimica et Biophysica Acta - Proteins and Proteomics, 1834(5): 890-897.

Access to the published version may require subscription.
Abstract

Proteins evolve through point mutations as well as through insertions and deletions (indels). During the last decade it has become apparent that protein regions that do not fold into three-dimensional structures, i.e. intrinsically disordered regions, are quite common. Here, we have studied the relationship between protein disorder and indels using HMM-HMM pairwise alignments in two sets of orthologous eukaryotic protein pairs. First, we show that disordered residues are much more frequent among indel residues than among aligned residues, and are also more prevalent among indels than in coils. Second, we observe that disordered residues are particularly common in longer indels. Disordered indels of short-to-medium size are prevalent in the nonterminal regions of proteins, while the longest indels, ordered and disordered alike, occur toward the termini of the proteins, where new structural units are comparatively well tolerated. Finally, while disordered regions often evolve faster than ordered regions and disorder is common in indels, there are some previously recognized protein families in which the disordered region is more conserved than the ordered region. We find that these rare proteins are often involved in information processes, such as RNA processing and translation.