Despite previous successful applications in pharmacogenomics, the utility of machine learning (ML) techniques for predicting warfarin dose in Caribbean Hispanic patients has yet to be fully evaluated. This study compares seven ML methods for predicting warfarin dosing in Caribbean Hispanics. It is a secondary analysis of genetic and non-genetic clinical data from 190 Hispanic cardiovascular patients. The data were split 80%/20% into training and test sets, and the seven ML algorithms were trained on the training set to obtain the models. Model performance was assessed by the mean absolute error (MAE) and the percentage of patients whose predicted optimal dose was within ±20% of the actual stabilization dose, and was then compared between groups of patients with "normal" (i.e., >21 but <49 mg/week), low (i.e., ≤21 mg/week, "sensitive"), and high (i.e., ≥49 mg/week, "resistant") dose requirements. Random forest regression (RFR) significantly outperformed all other methods, with an MAE of 4.73 mg/week and 80.56% of cases within ±20% of the actual stabilization dose. Among those with "normal" dose requirements, RFR also performed better than the other models (MAE = 2.91 mg/week). In the "sensitive" group, support vector regression (SVR) was superior to the others, with a lower MAE of 4.79 mg/week. Finally, multivariate adaptive regression splines (MARS) showed the best performance in the "resistant" group (MAE = 7.22 mg/week, with 66.7% of predictions within ±20%). Models generated with the RFR, MARS, and SVR algorithms predicted weekly warfarin dosing in the studied cohorts significantly better than the other algorithms. The ML models also performed better for patients with "normal," "sensitive," and "resistant" dose requirements than results reported for other populations and previous statistical models.
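The two performance measures described above (MAE and the percentage of predictions within ±20% of the actual dose), stratified by dose group, can be sketched as follows. This is a minimal illustration in plain Python, not the study's code: the function names and the example doses are hypothetical, and only the group cut-offs (≤21, >21 to <49, ≥49 mg/week) come from the abstract.

```python
from statistics import mean


def dose_group(actual_mg_week):
    """Classify a patient by actual stabilization dose (cut-offs from the abstract)."""
    if actual_mg_week <= 21:
        return "sensitive"
    if actual_mg_week >= 49:
        return "resistant"
    return "normal"


def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual dose (mg/week)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def pct_within_20(actual, predicted):
    """Percentage of predictions falling within +/-20% of the actual dose."""
    hits = sum(abs(p - a) <= 0.20 * a for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)


def evaluate_by_group(actual, predicted):
    """Compute both metrics separately for each dose-requirement group."""
    groups = {}
    for a, p in zip(actual, predicted):
        groups.setdefault(dose_group(a), []).append((a, p))
    report = {}
    for name, pairs in groups.items():
        acts, preds = zip(*pairs)
        report[name] = {
            "n": len(pairs),
            "mae": mean_absolute_error(acts, preds),
            "pct_within_20": pct_within_20(acts, preds),
        }
    return report


# Hypothetical doses in mg/week, for illustration only (not study data).
actual = [14.0, 20.0, 30.0, 35.0, 40.0, 52.5, 70.0]
predicted = [18.0, 21.0, 28.0, 36.0, 50.0, 45.0, 60.0]

overall_mae = mean_absolute_error(actual, predicted)
overall_pct = pct_within_20(actual, predicted)
by_group = evaluate_by_group(actual, predicted)
```

Grouping is done on the actual stabilization dose, which is how the abstract defines the three groups. Any regression model's test-set predictions could be passed in as `predicted`.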
Non-Hispanic Whites have a higher prevalence of atrial fibrillation (AF) than racial minorities living in the mainland USA. In two hospital-based studies, Puerto Rican Hispanics had a lower AF prevalence (2.5%) than non-Hispanic Whites (5.7%). These data are particularly controversial because Hispanics have a higher prevalence of traditional risk factors for developing AF yet a lower AF prevalence, a phenomenon known as the atrial fibrillation paradox. Despite recent advances in understanding AF, its pathogenesis remains unclear. In this study, we examined 111 SNPs known to be associated with AF in a large European cohort in a genetic dataset of Puerto Rican Hispanics to determine whether they are associated with AF susceptibility in our cohort. To achieve this aim, we performed a secondary analysis of existing data from two studies, (1) The Pharmacogenetics of Warfarin in Puerto Ricans and (2) A Genomic Approach for Clopidogrel in Caribbean Hispanics, and assessed the presence of the European AF-associated SNPs reported in a genome-wide association study of 1 million people that identified 111 loci for atrial fibrillation. We used data from 555 Puerto Rican Hispanic cardiovascular patients, comprising 486 controls and 69 cases. The following SNPs showed a significant association with AF in Puerto Rican Hispanics: rs2834618, rs6462079, rs7508, rs2040862, and rs10458660. Some of these SNPs are linked to proteins involved in lysosomal activities responsible for breaking down ceramides to sphingosines and in collagen deposition around atrial cardiomyocytes. Furthermore, we performed a machine learning analysis and determined that Native American admixture and heart failure were strongly predictive of AF in Puerto Rican Hispanics. For the first time, this study provides some genetic insight into the mechanisms of AF in a Puerto Rican Hispanic cohort.
OBJECTIVES/GOALS: To summarize baseline characteristics and risk factors for major adverse cardiovascular events (MACEs) and to develop a prediction model by testing the association between genetic variants and MACEs in Caribbean Hispanic patients on clopidogrel using machine-learning (ML) techniques. METHODS/STUDY POPULATION: This is a secondary analysis of available clinical and genomic data from an existing database of 600 Caribbean Hispanic cardiovascular (CV) patients on clopidogrel. MACEs are defined as the composite of all-cause death, myocardial infarction, stroke, and stent thrombosis over 6 months. The dataset is divided into training (60%) and testing (40%) sets. Two different supervised ML approaches (i.e., multiclass classification and regression algorithms) are applied to the study dataset using Python v3.5 and WEKA, and tested by receiver operating characteristic (ROC) curve analysis. A case-control association analysis between MACEs at 6 months and genotypes is performed using the chi-squared test. RESULTS/ANTICIPATED RESULTS: The average age of participants was 68 years and 55% were male, with a high prevalence of risk factors (i.e., overweight: 28.4 kg/m²; hypertension: 83.8%; hypercholesterolemia: 71.9%; and diabetes: 54.8%). The MACE rate is 13.8%, with 33.5% of patients resistant to clopidogrel. Logistic regression, KNN, and gradient boosting showed the best performance, as suggested by ROC analysis and AUC CV scores of 0.6-0.7. A significant association between MACE occurrence and ≥3 risk alleles was found (OR=8.17; p=0.041). We anticipate that these genetic variants (CYP2C19*2, rs12777823, PON1-rs662, ABCB1-rs2032582, PEAR1-rs12041331) will uniquely contribute to clopidogrel resistance and MACEs in Caribbean Hispanics. DISCUSSION/SIGNIFICANCE: Our findings help address, in part, the long-standing problem of excluding minorities from research, which has left a gap in knowledge about clopidogrel pharmacogenomics in Puerto Ricans. This study provides a possible ML model that integrates clinical and pharmacogenomic data for MACE risk estimation.
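The case-control association step (a chi-squared test on genotype versus MACE, reported with an odds ratio) can be illustrated on a single 2x2 table. This is a minimal sketch under stated assumptions: the counts are hypothetical, the function names are invented for this example, and the test is Pearson's chi-squared without continuity correction. The abstract does not say which variant of the test was used.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: cases with the exposure (e.g., >=3 risk alleles)
    b: cases without it
    c: controls with the exposure
    d: controls without it
    """
    return (a * d) / (b * c)


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (1 degree of freedom, no continuity
    correction) and its p-value for a 2x2 table."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for illustration only (not study data).
cases_exposed, cases_unexposed = 10, 20
controls_exposed, controls_unexposed = 5, 40

or_value = odds_ratio(cases_exposed, cases_unexposed,
                      controls_exposed, controls_unexposed)
chi2, p_value = chi_squared_2x2(cases_exposed, cases_unexposed,
                                controls_exposed, controls_unexposed)
```

With small cell counts, an exact test or a continuity correction is often preferred. Both would give a larger p-value than this uncorrected statistic.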
Antimicrobial and antiviral resistance are worldwide public health threats, causing treatment failures and increasing morbidity and mortality. Mycobacterium tuberculosis and Plasmodium falciparum, among many others, are examples of multidrug-resistant pathogens requiring combinatorial drug therapy. The use of drug combinations acting in synergy represents an approach to enhance therapy and delay the development of drug resistance. Computational approaches can be used to develop predictive models that assess synergistic drug combinations, reducing the time and cost associated with standard experimental screening. Here, we describe the development of a computational tool (Machine Learning Synergy Predictor, ML-SyPred) that incorporates drug/compound features using machine learning algorithms to predict synergistic drug combinations. Using the ML-SyPred tool, we implemented a synergy prediction method that includes several Python scripts to clean and prepare the raw data and to convert each drug's biochemical structure into compound fingerprints used as features. Five machine learning algorithms (i.e., Logistic Regression, Random Forest, Support Vector Machine, AdaBoost, and Gradient Boosting) were implemented to build prediction models. Two different biologically validated datasets consisting of 575 antibiotic and 1,054 antimalarial drug combinations were used to test the implemented algorithms. The best prediction models were obtained with the Random Forest algorithm for the antibiotic dataset (0.88 AUC), Logistic Regression for the antimalarial datasets for strains Dd2 and HB3 (0.81 and 0.70 AUC, respectively), and Random Forest for the antimalarial dataset for strain 3D7 (0.69 AUC). The ML-SyPred tool yielded 45% precision for antimalarial drug combinations predicted to be synergistic that are annotated and biologically validated, thus confirming the tool's functionality and applicability. The ML-SyPred tool is available for free use and represents a promising strategy to discover potential drug combinations for further development in novel therapies.
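The two evaluation measures quoted above, AUC and precision, can be computed from a model's predicted synergy scores as sketched below. This is a minimal plain-Python illustration, not ML-SyPred's code: the function names, labels, scores, and the 0.5 decision threshold are all hypothetical. AUC is computed here as the fraction of (synergistic, non-synergistic) pairs that the model ranks correctly, which is equivalent to the area under the ROC curve.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for synergistic combinations, 0 otherwise.
    scores: predicted probability or score of synergy.
    Ties between a positive and a negative count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def precision(labels, scores, threshold=0.5):
    """Fraction of combinations predicted synergistic that truly are."""
    predicted_pos = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted_pos:
        raise ValueError("no combinations predicted positive at this threshold")
    return sum(predicted_pos) / len(predicted_pos)


# Hypothetical labels and scores for six drug pairs (illustration only).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]

auc = roc_auc(labels, scores)
prec = precision(labels, scores)
```

The pairwise form is quadratic in the number of samples, which is fine for datasets of a few hundred to a few thousand combinations such as those described above. A sort-based implementation would be preferable for much larger sets.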