The results of this study suggest that statins may reduce the risk of liver cancer.
Existing methods for predicting protein crystallization obtain high accuracy by combining various types of complementary features with complex ensemble classifiers, such as support vector machine (SVM) and Random Forest classifiers. It is desirable to develop a simple and easily interpretable prediction method with informative sequence features to provide insights into protein crystallization. This study proposes an ensemble method, SCMCRYS, to predict protein crystallization, in which each classifier is built using a scoring card method (SCM) that estimates propensity scores of p-collocated amino acid (AA) pairs (p = 0 for a dipeptide). The SCM classifier determines the crystallization of a sequence according to a weighted-sum score. The weights are the composition of the p-collocated AA pairs, and the propensity scores of these AA pairs are estimated using a statistical approach followed by optimization. SCMCRYS predicts crystallization by simple voting among a number of SCM classifiers. The experimental results show that the single SCM classifier utilizing dipeptide composition, with an accuracy of 73.90%, is comparable to the best previously developed SVM-based classifier, SVM_POLY (74.6%), and to our proposed SVM-based classifier utilizing the same dipeptide composition (77.55%). The SCMCRYS method, with an accuracy of 76.1%, is comparable to the state-of-the-art ensemble methods PPCpred (76.8%) and RFCRYS (80.0%), which used SVM and Random Forest classifiers, respectively. This study also performs a mutagenesis analysis based on SCM, and the result suggests the hypothesis that mutagenesis of the surface residues Ala and Cys has, respectively, a large and a small probability of enhancing protein crystallizability, considering the estimated scores of crystallizability and the solubility, melting point, molecular weight and conformational entropy of amino acids in a generalized condition.
The propensity scores of amino acids and dipeptides for estimating protein crystallizability can aid biologists in designing mutations of surface residues to enhance protein crystallizability. The source code of SCMCRYS is available at http://iclab.life.nctu.edu.tw/SCMCRYS/.
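The weighted-sum scoring and voting described in this abstract can be illustrated with a minimal Python sketch. It is not the SCMCRYS source code: the function names, the score threshold, the strict-majority tie rule, and the toy propensity values are all assumptions made for illustration, and real propensity scores would come from the estimation and optimization steps the abstract mentions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 AA pairs


def pair_composition(seq, p=0):
    """Fraction of each p-collocated AA pair (two residues with p residues between them)."""
    counts = dict.fromkeys(PAIRS, 0)
    gap = p + 1
    total = 0
    for i in range(len(seq) - gap):
        pair = seq[i] + seq[i + gap]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in PAIRS}
    return {pair: n / total for pair, n in counts.items()}


def scm_score(seq, propensity, p=0):
    """Weighted sum: pair composition (weights) times pair propensity scores."""
    comp = pair_composition(seq, p)
    return sum(comp[pair] * propensity.get(pair, 0.0) for pair in PAIRS)


def scm_predict(seq, propensity, threshold, p=0):
    """Predict positive when the score exceeds a threshold (assumed decision rule)."""
    return scm_score(seq, propensity, p) > threshold


def vote(seq, classifiers):
    """Simple voting over (propensity, threshold, p) triples; strict majority assumed."""
    votes = sum(scm_predict(seq, prop, thr, p) for prop, thr, p in classifiers)
    return votes * 2 > len(classifiers)
```

With p = 0 the pairs are ordinary dipeptides; each voting member may use a different p and its own score card.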
Background: Existing methods for predicting protein solubility on overexpression in Escherichia coli improve performance by using ensemble classifiers, such as two-stage support vector machine (SVM) based classifiers, and a number of feature types, such as physicochemical properties and amino acid and dipeptide composition, together with feature selection. It is desirable to develop a simple and easily interpretable method for predicting protein solubility, compared to existing complex SVM-based methods. Results: This study proposes a novel scoring card method (SCM) that uses dipeptide composition only to estimate solubility scores of sequences for predicting protein solubility. SCM calculates the propensities of the 400 individual dipeptides to be soluble by statistical discrimination between the soluble and insoluble proteins of a training data set. Subsequently, the propensity scores of all dipeptides are further optimized using an intelligent genetic algorithm. The solubility score of a sequence is determined by the weighted sum of all propensity scores and dipeptide composition. To evaluate SCM by performance comparisons, four data sets with different sizes and degrees of variation in experimental conditions were used. The results show that the simple method SCM, with interpretable propensities of dipeptides, has promising performance compared with existing SVM-based ensemble methods that use a number of feature types. Furthermore, the propensities of dipeptides and the solubility scores of sequences can provide insights into protein solubility. For example, the analysis of dipeptide scores shows a high propensity of α-helix structure and thermophilic proteins to be soluble. Conclusions: The propensities of individual dipeptides to be soluble vary for proteins under different experimental conditions.
To predict protein solubility accurately using SCM, it is better to customize the score card of dipeptide propensities by using a training data set obtained under the same specified experimental conditions. The proposed method SCM, with its solubility scores and dipeptide propensities, can be easily applied to protein function prediction problems in which dipeptide composition features play an important role. Availability: The datasets, source code of SCM, and supplementary files are available at http://iclab.life.nctu.edu.tw/SCM/.
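The first SCM step, estimating dipeptide propensities by statistical discrimination between soluble and insoluble training proteins, can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes the propensity is the difference in pooled dipeptide frequency between the two classes, rescaled to a hypothetical 0 to 1000 range, and it omits the genetic-algorithm optimization step entirely. All function and parameter names are invented for the example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 dipeptides


def pooled_composition(seqs):
    """Dipeptide frequencies pooled over a set of sequences."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for seq in seqs:
        for i in range(len(seq) - 1):
            dipeptide = seq[i:i + 2]
            if dipeptide in counts:  # skip non-standard residues
                counts[dipeptide] += 1
                total += 1
    return {dp: (n / total if total else 0.0) for dp, n in counts.items()}


def initial_propensities(soluble, insoluble, lo=0.0, hi=1000.0):
    """Assumed rule: frequency in soluble minus frequency in insoluble, min-max scaled."""
    pos = pooled_composition(soluble)
    neg = pooled_composition(insoluble)
    diff = {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}
    dmin, dmax = min(diff.values()), max(diff.values())
    if dmax == dmin:  # no discrimination at all: give every dipeptide the midpoint
        return {dp: (lo + hi) / 2 for dp in DIPEPTIDES}
    scale = (hi - lo) / (dmax - dmin)
    return {dp: lo + (d - dmin) * scale for dp, d in diff.items()}
```

Because the score card is derived entirely from the training set, rebuilding it from data gathered under the target experimental conditions amounts to calling `initial_propensities` again on that data before any further optimization.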
Objective: We studied the association between statin dosage and the risk of Parkinson disease (PD) in diabetic patients in Taiwan. Methods: One million patients were randomly sampled from a National Health Insurance (NHI) database and followed from 2001 to 2008. Diabetic patients were identified by International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes, and statin dosage was determined according to the NHI pharmacy database. PD was diagnosed on the basis of ICD-9-CM codes and anti-Parkinson medication use. Statin users were classified by statin dose-duration-day > 28 and matched with nonusers of statins using a coarsened exact matching method. There were 50,432 patients, and half of them were statin users. We examined the risk of PD between statin users and nonusers of statins and further tested the trends of the relative risk between statin dosage and PD. Results: The PD incidence rate was lower in statin users than in nonusers of statins. The crude hazard ratio of PD incidence in statin users was 0.65 (95% confidence interval [CI] = 0.57-0.74) in females and 0.60 (95% CI = 0.51-0.69) in males compared with nonusers of statins. After Cox regression analysis, all statins except lovastatin exerted protective effects on PD incidence and had a significant dose-dependent trend. Interpretation: In Taiwanese diabetic patients, the risk of PD is lower in statin users than in nonusers of statins. Statin users, except lovastatin users, are dose-dependently associated with a decreased incidence of PD compared with nonusers of statins. This finding provides a new indication for statins beyond lipid control and cardiovascular events in diabetic patients. ANN NEUROL 2016;80:532-540. Parkinson disease (PD) is the second most common neurodegenerative disease.
It causes irreversible, progressive disability that is characterized by muscle rigidity, slowing of physical movements, behavioral abnormalities, and autonomic impairments. 1,2 PD causes progressive disability and imposes a burden on patients and their families. The etiology of PD is inconclusive but is associated with multiple factors, such as genetic variety, oxidative stress, and progressive neuroinflammation. 3 According to recent reports, diabetes mellitus (DM) and elevated cardiovascular risk factors might play a role in the development of PD. 4-6 Some studies on animal and human models have indicated that tumor necrosis factor-α, nitric oxide, inducible nitric oxide synthase (iNOS), and oxygen-derived free radicals play a role in the development of PD. 3,7,8 Statin is a competitive 3-hydroxy-3-methylglutaryl-coenzyme A reductase inhibitor and is widely used for treating hypercholesterolemia, particularly for maintaining low-density lipoprotein (LDL) cholesterol