The classification of human genetic variants into deleterious and neutral is a challenging issue, whose complexity is rooted in the large variety of biophysical mechanisms that can be responsible for disease conditions. For non-synonymous mutations in structured proteins, one of these is the change in protein stability, which can lead to loss of protein structure or function. We developed a stability-driven knowledge-based classifier that uses protein structure, artificial neural networks and solvent accessibility-dependent combinations of statistical potentials to predict whether destabilizing or stabilizing mutations are disease-causing. Our predictor yields a balanced accuracy of 71% in cross-validation. As expected, it has a very high positive predictive value of 89%: it predicts with high accuracy the subset of mutations that are deleterious because of stability issues, but is by construction unable to classify variants that are deleterious for other reasons. Its combination with an evolutionary-based predictor increases the balanced accuracy to 75% and allows more than 1/4 of the variants to be predicted with 95% positive predictive value. Our method, called SNPMuSiC, can be used with both experimental and modeled structures and compares favorably with other prediction tools on several independent test sets. It constitutes a step towards interpreting variant effects at the molecular scale. SNPMuSiC is freely available at https://soft.dezyme.com/.
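The two figures of merit quoted in this abstract, balanced accuracy and positive predictive value, can be made concrete with a short sketch. The code below is only an illustration of the standard definitions, assuming binary labels (1 = deleterious, 0 = neutral); the function names and the toy labels are hypothetical and are not taken from SNPMuSiC or its datasets.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = deleterious, 0 = neutral)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity; insensitive to class imbalance."""
    sensitivity = tp / (tp + fn)  # fraction of deleterious variants recovered
    specificity = tn / (tn + fp)  # fraction of neutral variants recovered
    return (sensitivity + specificity) / 2


def positive_predictive_value(tp, fp):
    """Fraction of variants predicted deleterious that truly are deleterious."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Toy labels, invented for illustration only (not results from the paper).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print("balanced accuracy:", balanced_accuracy(tp, tn, fp, fn))
    print("positive predictive value:", positive_predictive_value(tp, fp))
```

A predictor that only calls a variant deleterious when it sees a clear stability effect will tend to have few false positives (high positive predictive value) while missing variants that are deleterious for other reasons (lower sensitivity), which is the trade-off the abstract describes.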
Motivation: Although structured proteins adopt their lowest free energy conformation in physiological conditions, the individual residues are generally not in their lowest free energy conformation. Residues that are stability weaknesses are often involved in functional regions, whereas stability strengths ensure local structural stability. The detection of strengths and weaknesses provides key information to guide protein engineering experiments aiming to modulate folding and various functional processes.
Results: We developed the SWOTein predictor which identifies strong and weak residues in proteins on the basis of three types of statistical energy functions describing local interactions along the chain, hydrophobic forces and tertiary interactions. The large-scale analysis of the different types of strengths and weaknesses demonstrated their complementarity and the enhancement of the information they provide. Moreover, a good average correlation was observed between predicted and experimental strengths and weaknesses obtained from native hydrogen exchange data. SWOTein application to three test cases further showed its suitability to predict and interpret strong and weak residues in the context of folding, conformational changes and protein-protein binding. In summary, SWOTein is both fast and accurate and can be applied at small and large scale to analyze and modulate folding and molecular recognition processes.
Availability: The SWOTein webserver provides the list of predicted strengths and weaknesses and a protein structure visualization tool that facilitates the interpretation of the predictions. It is freely available for academic use at http://babylone.ulb.ac.be/SWOTein/
The classification of human genetic variants into deleterious and neutral is a challenging issue, whose complexity is rooted in the large variety of biophysical mechanisms that can be responsible for disease conditions. For non-synonymous mutations in structured proteins, one of these is the change in protein stability, which can lead to loss of function. We developed a stability-driven knowledge-based classifier that uses protein structure, artificial neural networks and solvent accessibility-dependent combinations of statistical potentials to predict whether destabilizing or stabilizing mutations are disease-causing. Our predictor yields a balanced accuracy of 71% in cross-validation. As expected, it has a very high positive predictive value of 89%: it predicts with high accuracy the subset of mutations that are deleterious because of stability issues, but is by construction unable to classify variants that are deleterious for other reasons. Its combination with an evolutionary-based predictor increases the balanced accuracy to 75% and allows more than 1/4 of the deleterious variants to be predicted with 95% positive predictive value. Our method, called SNPMuSiC, can be used with both experimental structures and structural models and compares favorably with other prediction tools on several independent test sets. It constitutes a step towards interpreting variant effects at the molecular scale.
With more than forty causative genes identified so far, autosomal dominant cerebellar ataxias exhibit a remarkable genetic heterogeneity. Yet, half of the patients lack a molecular diagnosis. In a large family with nine sampled affected members, we performed exome sequencing combined with whole-genome linkage analysis. We identified a missense variant in NPTX1, NM_002522.3: c.1165G>A: p.G389R, segregating with the phenotype. Further investigations with whole exome sequencing and an amplicon-based panel identified four additional unrelated families segregating the same variant, for whom a common founder effect could be excluded. A second missense variant, NM_002522.3: c.980A>G: p.E327G, was identified in a fifth familial case. The NPTX1-associated phenotype consists of a late-onset, slowly progressive, cerebellar ataxia, with downbeat nystagmus, cognitive impairment reminiscent of cerebellar cognitive affective syndrome, myoclonic tremor and mild cerebellar vermian atrophy on brain imaging. NPTX1 encodes the neuronal pentraxin 1, a secreted protein with various cellular and synaptic functions. Both variants affect conserved amino-acid residues and are extremely rare or absent from public databases. In COS7 cells, overexpression of both neuronal pentraxin 1 variants altered endoplasmic reticulum morphology and induced ATF6-mediated endoplasmic reticulum stress, associated with cytotoxicity. In addition, the p.E327G variant abolished neuronal pentraxin 1 secretion, as well as its capacity to form a high molecular weight complex with the wild-type protein. Co-immunoprecipitation experiments coupled with mass spectrometry analysis demonstrated abnormal interactions of this variant with the cytoskeleton. In agreement with these observations, in silico modelling of the neuronal pentraxin 1 complex indicated a destabilizing effect for the p.E327G substitution, located at the interface between monomers. In contrast, the p.G389 residue, located at the protein surface, had no predictable effect on the complex stability. Our results establish NPTX1 as a new causative gene in autosomal dominant cerebellar ataxias. We suggest that variants in NPTX1 can lead to cerebellar ataxia due to endoplasmic reticulum stress, mediated by ATF6, and associated with a destabilization of NP1 polymers in a dominant-negative manner for one of the variants.
We provide integrated protein sequence-based predictions via https://bio2byte.be/b2btools/. The aim of our predictions is to identify the biophysical behaviour or features of proteins that are not readily captured by structural biology and/or molecular dynamics approaches. Upload of a FASTA file or text input of a sequence provides integrated predictions from DynaMine backbone and side-chain dynamics, conformational propensities, and derived EFoldMine early folding, DisoMine disorder, and Agmata β-sheet aggregation. These predictions, several of which were previously not available online, capture ‘emergent’ properties of proteins, i.e. the inherent biophysical propensities encoded in their sequence, rather than context-dependent behaviour (e.g. final folded state). In addition, upload of a multiple sequence alignment (MSA) in a variety of formats enables exploration of the biophysical variation observed in homologous proteins. The associated plots indicate the biophysical limits of functionally relevant protein behaviour, with unusual residues flagged by a Gaussian mixture model analysis. The prediction results are available as JSON or CSV files and directly accessible via an API. Online visualisation is available as interactive plots, with brief explanations and tutorial pages included. The server and API employ an email-free token-based system that can be used to anonymously access previously generated results.
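Since the server takes a FASTA file or a pasted sequence as input, a local sanity check before upload can be sketched as follows. This is a minimal, stdlib-only illustration and not part of b2btools: the function names are hypothetical, and restricting sequences to the 20 standard amino acid letters is an assumption, since the abstract does not state which characters the server accepts.

```python
# Assumption: only the 20 standard amino acid one-letter codes are expected.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[header].append(line.upper())
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(chunks) for name, chunks in records.items()}


def invalid_residues(sequence):
    """Return (1-based position, character) pairs outside the standard alphabet."""
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char not in STANDARD_AMINO_ACIDS
    ]


if __name__ == "__main__":
    example = ">seq1 demo\nMKT\nAYI\n>seq2\nMXK\n"  # invented example input
    for name, seq in parse_fasta(example).items():
        print(name, seq, invalid_residues(seq))
```

Flagging unexpected characters locally gives a quick way to catch truncated or mis-pasted input before any predictions are requested.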