Keywords: conformational equilibrium, intrinsic disorder, cellular signaling, protein ensemble, local unfolding

Author Contributions: MHC, JOW, JT, VJH designed research; MHC performed research; MHC contributed new analytic tools; MHC, JOW, JT, VJH analyzed data; MHC, JOW, JT, VJH wrote the paper.
Abstract

Phosphorylation sites are hyper-abundant in the disordered proteins of eukaryotes, suggesting that conformational dynamics (or heterogeneity) may play a major role in determining to what extent a kinase interacts with a particular substrate. In biophysical terms, substrate selectivity may be determined not just by the structural and chemical complementarity between the kinase and its protein substrates, but also by the free energy difference between the conformational ensembles that are recognized by the kinase and those that are not. To test this hypothesis, we developed an informatics framework based on statistical thermodynamics, which allows us to probe for dynamic contributions to phosphorylation, as evaluated by the ability to predict Ser/Thr/Tyr phosphorylation sites in the disordered proteome. Essential to this framework is a decomposition of substrate sequence information into two types: vertical information encoding conserved kinase specificity motifs and horizontal (distributed) information encoding substrate conformational dynamics that are embedded, but often not apparent, within position-specific conservation patterns. We find not only that conformational dynamics play a major role, but that they are the dominant contribution to substrate selectivity. In fact, the main substrate classifier distinguishing selectivity is the magnitude of change in compaction of the disordered chain upon phosphorylation. Thus, in addition to providing fundamental insights into the underlying mechanistic consequences of phosphorylation across the entire proteome, our approach provides a novel statistical thermodynamic strategy for partitioning any sequence-based search into contributions from direct chemical and structural complementarity and those from changes in conformational dynamics. Using this framework, we developed a high-performance open-source phosphorylation site predictor, PHOSforUS, which is freely available at https://github.com/bxlab/PHOSforUS.
Furthermore, our results indicate that the sequence neighborhoods of many serine (Ser) and threonine (Thr) phosphorylation sites, specifically those containing Pro immediately C-terminal to the phosphorylated site (i.e., the +1 Pro sequence motif), are "energetically poised" to undergo a phosphorylation-induced change in the dimensions of the disordered ensemble. This suggests a direct link between the conformational dimensions of the disordered substrate and its ability to be recognized and phosphorylated.
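To make the +1 Pro motif concrete, the following minimal sketch locates Ser/Thr residues that are immediately followed by Pro in a one-letter amino acid sequence. It is an illustration only and is not taken from PHOSforUS; the function name `find_plus1_pro_sites` and the example sequence are hypothetical.

```python
import re

# Ser (S) or Thr (T) followed by Pro (P). The lookahead leaves the Pro
# unconsumed, so only the candidate phosphoacceptor residue is matched.
PLUS1_PRO = re.compile(r"[ST](?=P)")


def find_plus1_pro_sites(sequence):
    """Return (1-based position, residue) for each Ser/Thr with Pro at +1.

    Hypothetical helper for illustration; it reports motif occurrences only
    and makes no statement about whether a site is actually phosphorylated.
    """
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in PLUS1_PRO.finditer(seq)]


if __name__ == "__main__":
    # Made-up example sequence, not a real protein.
    example = "MASPTPEESPK"
    print(find_plus1_pro_sites(example))  # [(3, 'S'), (5, 'T'), (9, 'S')]
```

A scan of this kind captures only the "vertical" motif information described in the abstract; it carries no information about the conformational ensemble of the surrounding chain.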