The molecular function of a protein relies on its structure. Understanding how variants alter structure and function in multidomain proteins is key to elucidating how a pathological phenotype arises. However, one may fall into the logical bias of assessing protein damage based only on the variants that are visible (survivorship bias), which can lead to partial conclusions. This is the case for PNKP, an important nuclear and mitochondrial DNA repair enzyme with both kinase and phosphatase functions. Most variants in PNKP are confined to the kinase domain, leading to a pathological spectrum of three apparently distinct clinical entities. Since proteins and domains may differ in their tolerance to variation, we evaluated whether variants in PNKP are subject to survivorship bias. Here, we provide evidence that the kinase domain is more tolerant to variation, even though all reported variants are deleterious. In contrast, the phosphatase domain is less tolerant, as indicated by its lower variant rates, higher degree of sequence conservation, lower dN/dS ratios, and greater number of disease-propensity hotspots. Together, our results support previous experimental evidence demonstrating that the phosphatase domain is functionally more necessary and relevant for DNA repair, especially in the context of the development of the central nervous system. Finally, we propose the term "Wald’s domain" for future studies analyzing possible survivorship bias in multidomain proteins.
The molecular function of a protein relies on its structure. Understanding how mutations alter structure and function in multi-domain proteins is key to elucidating how a pathological phenotype is generated. However, one may fall into the logical bias of assessing protein damage based only on the mutations that are viable (survivorship bias), which can lead to partial conclusions. This is the case for PNKP, an important nuclear and mitochondrial DNA repair enzyme with kinase and phosphatase functions. Most mutations in PNKP are confined to the kinase domain, leading to a pathological spectrum of three apparently distinct clinical entities. Since proteins and domains may differ in their tolerance to disease-causing mutations, we evaluated whether mutations in PNKP are subject to survivorship bias. Even though all mutations in the kinase domain are deleterious, we found that this domain has a greater tolerance to mutation in terms of survival. In contrast, the phosphatase domain is less tolerant, as indicated by its low mutation rates, higher degree of sequence conservation, lower dN/dS ratios, and greater number of disease-propensity hotspots. Thus, in multi-domain proteins, we propose the term "Wald's domain" for those domains that are not apparently more associated with disease but are less tolerant to mutations in terms of survival. Together, our results support previous experimental evidence demonstrating that the phosphatase domain is functionally more necessary and relevant for DNA repair, especially in the context of the development of the central nervous system. This bias should therefore be taken into account when analyzing the mutational landscape of protein structure, function, and, ultimately, disease.