Identification of Phase-Separation-Protein-Related Function Based on Gene Ontology by Using Machine Learning Methods

Ma, Qinglan; Huang, FeiMing; Guo, Wei; Kong, Feng; Huang, Tao; Cai, Yu-Dong

doi:10.3390/life13061306

Cited by 1 publication (1 citation statement)

References 71 publications (95 reference statements)


“…To complement wet-lab studies, several computational approaches have been developed to study phase separation processes. Most approaches in this space have focused on predicting one-dimensional propensity scores [22–26] but not the composition of heteromolecular condensates. In order to address this challenge, here, we developed a framework that utilised available experimental data to train machine learning models that define the condensation-prone proteome and subsequently combined this information with biomolecular interaction profiles to generate a Protein Condensate Atlas, which predicts the composition of heteromolecular condensates.…”

Section: Introduction (mentioning)

confidence: 99%

Protein Condensate Atlas from predictive models of heteromolecular condensate composition

Saar, Scrutton, Bloznelyte et al. 2024, Nat Commun


Biomolecular condensates help cells organise their content in space and time. Cells harbour a variety of condensate types with diverse composition and many are likely yet to be discovered. Here, we develop a methodology to predict the composition of biomolecular condensates. We first analyse available proteomics data of cellular condensates and find that the biophysical features that determine protein localisation into condensates differ from known drivers of homotypic phase separation processes, with charge mediated protein-RNA and hydrophobicity mediated protein-protein interactions playing a key role in the former process. We then develop a machine learning model that links protein sequence to its propensity to localise into heteromolecular condensates. We apply the model across the proteome and find many of the top-ranked targets outside the original training data to localise into condensates as confirmed by orthogonal immunohistochemical staining imaging. Finally, we segment the condensation-prone proteome into condensate types based on an overlap with biomolecular interaction profiles to generate a Protein Condensate Atlas. Several condensate clusters within the Atlas closely match the composition of experimentally characterised condensates or regions within them, suggesting that the Atlas can be valuable for identifying additional components within known condensate systems and discovering previously uncharacterised condensates.
