Postpartum depression is a serious health issue that extends beyond the mental health problems affecting mothers after childbirth. No predictive tools are currently available that screen for postpartum depression and also allow early intervention. We aimed to develop predictive models for postpartum depression using machine learning (ML) approaches. We performed a retrospective cohort study using data from the Pregnancy Risk Assessment Monitoring System 2012–2013 with 28,755 records (3,339 postpartum depression and 25,416 normal cases). The imbalance between the two groups was addressed by balanced resampling using both random down-sampling and the synthetic minority over-sampling technique (SMOTE). Nine different ML algorithms, including random forest (RF), stochastic gradient boosting, support vector machines (SVM), recursive partitioning and regression trees, naïve Bayes, k-nearest neighbor (kNN), logistic regression, and neural network, were evaluated with 10-fold cross-validation. The overall classification accuracies of the nine models ranged from 0.650 (kNN) to 0.791 (RF). The RF method achieved the highest area under the receiver-operating-characteristic curve (AUC), 0.884, followed by SVM with an AUC of 0.864. Predictive models developed using ML approaches may thus be used as a prediction (screening) tool for postpartum depression in future studies.
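The class-balancing step and the AUC metric mentioned above can be illustrated with a minimal sketch. It is not the study's code: it covers only random down-sampling (SMOTE is omitted), the function names `random_downsample` and `roc_auc` are hypothetical, and the toy scores are invented for illustration. Only the class counts (3,339 and 25,416) come from the abstract.

```python
import random
from collections import Counter


def random_downsample(labels, seed=0):
    """Return sorted indices of a class-balanced subset.

    Every class is randomly reduced to the size of the smallest class.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    return sorted(kept)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Class counts reported in the abstract: 1 = postpartum depression, 0 = normal.
labels = [1] * 3339 + [0] * 25416
balanced = random_downsample(labels, seed=42)
balanced_counts = Counter(labels[i] for i in balanced)

# Toy scores (invented) to show the AUC calculation on four records.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

After down-sampling, both classes contain 3,339 records. The pairwise AUC is quadratic in the number of records, so it is suited to illustration rather than to a dataset of this size.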
A major challenge in drug development is safety and toxicity concerns due to drug side effects. One such side effect, drug-induced liver injury (DILI), is considered a primary factor in regulatory clearance. The goal of the Critical Assessment of Massive Data Analysis (CAMDA) 2020 CMap Drug Safety Challenge was to develop prediction models based on gene perturbation of six preselected cell lines (CMap L1000), extended structural information (MOLD2), toxicity data (TOX21), and FDA reporting of adverse events (FAERS). Four types of DILI classes were targeted, including two clinically relevant scores and two control classifications, designed by the CAMDA organizers. The L1000 gene expression data had variable drug coverage across cell lines, with only 247 out of 617 drugs in the study measured in all six cell types. We addressed this coverage issue by using Kru-Bor ranked merging to generate a single drug expression signature across all six cell lines. These merged signatures were then narrowed down to the top and bottom 100, 250, 500, or 1,000 genes most perturbed by drug treatment. These signatures were subjected to feature selection using Fisher’s exact test to identify genes predictive of DILI status. Models based solely on expression signatures had varying results for clinical DILI subtypes, with accuracy ranging from 0.49 to 0.67 and Matthews Correlation Coefficient (MCC) values ranging from -0.03 to 0.1. Models built using FAERS, MOLD2, and TOX21 had similar results in predicting clinical DILI scores, with accuracy ranging from 0.56 to 0.67 and MCC scores ranging from 0.12 to 0.36. To incorporate these various data types with expression-based models, we utilized soft, hard, and weighted ensemble voting methods using the top three performing models for each DILI classification. These voting models achieved balanced accuracies of up to 0.54 and 0.60 for the two clinically relevant DILI subtypes.
Overall, our experiments suggest that traditional machine learning approaches may not be optimal for classifying the current data.
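The hard, soft, and weighted voting schemes and the two reported metrics (MCC and balanced accuracy) can be sketched as follows. This is an illustrative assumption about how such voting is commonly defined, not the authors' implementation: the three model probability lists, the weights, and all function names are hypothetical, and weighted voting is taken here to mean a weighted average of positive-class probabilities.

```python
import math


def hard_vote(pred_lists):
    """Majority vote over binary predictions from several models."""
    n_models = len(pred_lists)
    return [1 if 2 * sum(col) > n_models else 0 for col in zip(*pred_lists)]


def soft_vote(prob_lists, weights=None, threshold=0.5):
    """Average positive-class probabilities (optionally weighted), then threshold."""
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    votes = []
    for col in zip(*prob_lists):
        avg = sum(w * p for w, p in zip(weights, col)) / total
        votes.append(1 if avg >= threshold else 0)
    return votes


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (assumes both classes are present)."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Hypothetical positive-class probabilities from three models on six drugs.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.6, 0.4, 0.3, 0.2, 0.7]
model_b = [0.8, 0.4, 0.6, 0.1, 0.6, 0.4]
model_c = [0.7, 0.7, 0.3, 0.4, 0.3, 0.2]
probs = [model_a, model_b, model_c]

# Hard voting uses each model's thresholded prediction.
hard = hard_vote([[1 if p >= 0.5 else 0 for p in m] for m in probs])
soft = soft_vote(probs)
# Hypothetical weights that favor the second model.
weighted = soft_vote(probs, weights=[1.0, 4.0, 1.0])

hard_mcc = mcc(y_true, hard)
hard_bal_acc = balanced_accuracy(y_true, hard)
```

On this toy input, hard and soft voting agree, while the weighted vote changes two predictions because the second model dominates the average. MCC and balanced accuracy are both less sensitive to class imbalance than plain accuracy, which helps explain why accuracies of 0.49 to 0.67 can coexist with MCC values near zero.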