This paper investigates the performance of topic modeling in embedding spaces in the context of depression assessment. Using the textual content of social media users from the eRisk 2018 dataset, a classification task is performed with features generated by the Embedded Topic Model. For contrast with traditional topic modeling, a full comparison with the Latent Dirichlet Allocation model is presented. An extensive range of topics and different preprocessing strategies are studied to demonstrate the efficiency of the models. Our results show a noteworthy improvement on the explored task from the application of the novel topic modeling approach.
Deep Averaging Networks (DANs) show strong performance in several key Natural Language Processing (NLP) tasks. However, their chief drawback is that they do not account for the position of tokens when encoding sequences. We study how existing position encodings might be integrated into the DAN architecture. In addition, we propose a novel position encoding built specifically for DANs, which generalizes better to unseen sequence lengths. This is demonstrated on decision tasks on binary sequences. Further, the resulting architecture is compared against unordered aggregation on sentiment analysis with both word- and character-level tokenization, with mixed results.
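The position-blindness described in this abstract can be illustrated with a minimal sketch. It is not the paper's architecture or its novel encoding: the toy embeddings, the function names, and the choice to combine embeddings with a standard sinusoidal encoding by elementwise multiplication are all assumptions made for illustration. Multiplication is used because simply adding a position vector to each embedding leaves the average unchanged under reordering.

```python
import math


def average(vectors):
    """Unordered aggregation: the mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sinusoidal_encoding(position, dim):
    """Standard sinusoidal position encoding (an existing scheme,
    used here as a stand-in; not the encoding proposed in the paper)."""
    encoding = []
    for i in range(dim):
        angle = position / (10000 ** (2 * (i // 2) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


def position_aware_average(vectors):
    """Hypothetical combination: scale each embedding elementwise by
    its position encoding, then average."""
    dim = len(vectors[0])
    scaled = []
    for position, vector in enumerate(vectors):
        encoding = sinusoidal_encoding(position, dim)
        scaled.append([v * e for v, e in zip(vector, encoding)])
    return average(scaled)


# Toy embeddings for two tokens (made-up values).
token_a = [1.0, 2.0, 3.0, 4.0]
token_b = [4.0, 3.0, 2.0, 1.0]
tokens_ab = [token_a, token_b]
tokens_ba = [token_b, token_a]

# Plain averaging cannot tell the two orderings apart.
print(average(tokens_ab) == average(tokens_ba))
# The position-aware variant produces different vectors for each order.
print(position_aware_average(tokens_ab) == position_aware_average(tokens_ba))
```

In a full DAN the pooled vector would then pass through feed-forward layers; only the aggregation step is sketched here.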
This work proposes an approach to predict potential answers to the Beck Depression Inventory-Second Edition (BDI-II), a 21-item self-report inventory measuring the severity of depression in adolescents and adults. Predictions are based on similarity measures between the textual productions of social media users and completed BDI-IIs. Two methods of establishing similarity are compared. The first uses unsupervised extraction of topics, and the second is based on authorship attribution through the use of neural encoders. Both approaches achieve interesting results, indicating that the authorship attribution task can induce a similarity measure useful for depression symptom detection. The issues that arise in predicting several aspects of depression are further discussed.
In this paper, we explore topic modeling for the assessment of risk for depression, anorexia and self-harm. Using social media textual content from different datasets, we focus on Latent Dirichlet Allocation models, trained on both specific and combined corpora built from these datasets, to perform risk detection. We investigate whether mental health vocabulary and shared topic modeling improve user classification performance.