Joshua Cohen scite author profile

Background: As adolescent suicide rates continue to rise, innovation in risk identification is warranted. Machine learning can identify suicidal individuals based on their language samples. This feasibility pilot was conducted to explore this technology’s use in adolescent therapy sessions and assess machine learning model performance. Method: Natural language processing machine learning models that identify level of suicide risk were tested in outpatient therapy sessions using a smartphone app. Data collection included language samples, depression and suicidality standardized scale scores, and therapist impression of the client’s mental state. Previously developed models were used to predict suicide risk. Results: 267 interviews were collected from 60 students in eight schools by ten therapists, with 29 students indicating suicide or self-harm risk. During external validation, models were trained on suicidal speech samples collected from two separate studies. We found that support vector machines (AUC: 0.75; 95% CI: 0.69–0.81) and logistic regression (AUC: 0.76; 95% CI: 0.70–0.82) led to good discriminative ability, with an extreme gradient boosting model performing the best (AUC: 0.78; 95% CI: 0.72–0.84). Conclusion: Voice collection technology and associated procedures can be integrated into mental health therapists’ workflow. Collected language samples could be classified with good discrimination using machine learning methods.

Integration and Validation of a Natural Language Processing Machine Learning Suicide Risk Prediction Model Based on Open-Ended Interview Language in the Emergency Department

Cohen, Wright-Berryman, Rohlfs, et al. 2022

Front. Digit. Health

Background: Emergency departments (ED) are an important intercept point for identifying suicide risk and connecting patients to care; however, more innovative, person-centered screening tools are needed. Natural language processing (NLP)-based machine learning (ML) techniques have shown promise to assess suicide risk, although whether NLP models perform well in differing geographic regions, at different time periods, or after large-scale events such as the COVID-19 pandemic is unknown. Objective: To evaluate the performance of an NLP/ML suicide risk prediction model on newly collected language from the Southeastern United States using models previously tested on language collected in the Midwestern US. Method: 37 suicidal and 33 non-suicidal patients from two EDs were interviewed to test a previously developed suicide risk prediction NLP/ML model. Model performance was evaluated with the area under the receiver operating characteristic curve (AUC) and Brier scores. Results: NLP/ML models performed with an AUC of 0.81 (95% CI: 0.71–0.91) and Brier score of 0.23. Conclusion: The language-based suicide risk model performed with good discrimination when identifying the language of suicidal patients from a different part of the US and at a later time period than when the model was originally developed and trained.
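For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of how AUC and the Brier score can be computed from binary labels and predicted probabilities. It is not the authors' evaluation code; the function names `roc_auc` and `brier_score` and the toy data are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Toy data (illustrative only, not from the study): 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(f"AUC = {auc:.3f}, Brier = {brier:.3f}")
```

AUC measures ranking quality (0.5 is chance, 1.0 is perfect), while the Brier score measures how close the predicted probabilities are to the observed outcomes (lower is better), which is why the two are often reported together.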

GenotypeTensors: Efficient Neural Network Genotype Callers

Cohen, Simi, Campagne 2018

Preprint

We studied the problem of calling genotypes using neural networks. A machine learning approach to calling genotypes requires a training set, an approach to convert genomic sites into tensors, and robust model development and evaluation protocols. We discuss each of these components of our approach and compare four types of neural network training protocols: two fully supervised and two semi-supervised approaches. Semi-supervised approaches use unlabeled data to supplement limited quantities of labeled data. Random hyper-parameter searches identified highly performing models that reach an indel F1 of 99.4% on chromosomes 20, 21, 22 and X of NA12878/HG001. We further validate these models by evaluating performance on HG002, an independent sample used in the PrecisionFDA challenge. We apply GenotypeTensors to evaluate the impact of (1) training with small datasets, (2) training models only with sites inside confidence regions, or (3) training with improved true label annotations. A PyTorch open-source implementation of GenotypeTensors is available at https://github.com/CampagneLaboratory/GenotypeTensors. DNANexus cloud applications are provided to help process new datasets, both to train models and to call genotypes with trained models.

Keywords: Deep Learning, Machine Learning, Genotype Caller, High-Throughput Sequencing

Recent work showed that careful tuning of baseline architectures can yield state-of-the-art performance compared to more complex architectures [Merity et al., 2018] (the authors studied sequence models for natural language processing tasks). This study confirms that hyper-parameter tuning is critical to training state-of-the-art neural network models. In practice, selecting optimal hyper-parameters is difficult because of the computational burden of training many models with different hyper-parameters. In this study, we present and take advantage of an approach that greatly speeds up hyper-parameter searches when the models are small and many models can fit in the memory of a single graphics processing unit (GPU).

Figure 1 presents an overview of the process we followed to prepare data for neural network training. Briefly, short reads were aligned to the human genome, and alignments were processed with HaplotypeCaller [McKenna et al., 2010] to realign SNPs in the proximity of indels and to reduce the dimensionality of the dataset to regions likely to contain variation. Alignments were converted to a vectorial representation suitable to train a feed-forward neural network. Figure 1 also illustrates the funnel architecture, which allows for interactions of every feature with every other feature and progressively ...

Virtually screening adults for depression, anxiety, and suicide risk using machine learning and language from an open-ended interview

Wright-Berryman, Cohen, Haq, et al. 2023

Front. Psychiatry

Background: Current depression, anxiety, and suicide screening techniques rely on retrospective patient-reported symptoms on standardized scales. A qualitative approach to screening, combined with the innovation of natural language processing (NLP) and machine learning (ML) methods, has shown promise to enhance person-centeredness while detecting depression, anxiety, and suicide risk from in-the-moment patient language derived from an open-ended brief interview. Objective: To evaluate the performance of NLP/ML models to identify depression, anxiety, and suicide risk from a single 5–10-min semi-structured interview with a large, national sample. Method: Two thousand four hundred sixteen interviews were conducted with 1,433 participants over a teleconference platform, with 861 (35.6%), 863 (35.7%), and 838 (34.7%) sessions screening positive for depression, anxiety, and suicide risk, respectively. Participants completed an interview over a teleconference platform to collect language about the participants’ feelings and emotional state. Logistic regression (LR), support vector machine (SVM), and extreme gradient boosting (XGB) models were trained for each condition using term frequency-inverse document frequency features from the participants’ language. Models were primarily evaluated with the area under the receiver operating characteristic curve (AUC). Results: The best discriminative ability was found when identifying depression with an SVM model (AUC = 0.77; 95% CI = 0.75–0.79), followed by anxiety with an LR model (AUC = 0.74; 95% CI = 0.72–0.76), and an SVM for suicide risk (AUC = 0.70; 95% CI = 0.68–0.72). Model performance was generally best with more severe depression, anxiety, or suicide risk. Performance improved when individuals with lifetime suicide risk but no suicide risk in the past 3 months were considered controls. Conclusion: It is feasible to use a virtual platform to simultaneously screen for depression, anxiety, and suicide risk using a 5-to-10-min interview. The NLP/ML models performed with good discrimination in the identification of depression, anxiety, and suicide risk. Although the utility of suicide risk classification in clinical settings is still undetermined and suicide risk classification had the lowest performance, the result taken together with the qualitative responses from the interview can better inform clinical decision-making by providing additional drivers associated with suicide risk.
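The abstract mentions term frequency-inverse document frequency (TF-IDF) features without giving a formula. The sketch below shows one common textbook variant, tf = count / document length and idf = ln(N / document frequency); the study's exact weighting, tokenization, and smoothing are not stated here, so all of these choices are assumptions, and the function name `tfidf` and the example sentences are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: whitespace tokenization, lowercase,
    tf = count / document length, idf = ln(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (c / total) * math.log(n_docs / df[term])
             for term, c in counts.items()}
        )
    return vectors


# Illustrative sentences only, not study data.
docs = [
    "the week went fine overall",
    "the week felt long and stressful",
    "the weekend was restful",
]
vectors = tfidf(docs)
for vec in vectors:
    print({term: round(w, 3) for term, w in vec.items()})
```

Terms that appear in every document (here, "the") receive a weight of zero, while terms concentrated in few documents are weighted up; the resulting sparse vectors are the kind of input a logistic regression, SVM, or gradient boosting classifier can be trained on.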

A Feasibility Study Using a Machine Learning Suicide Risk Prediction Model Based on Open-Ended Interview Language in Adolescent Therapy Sessions
