Anorexia Nervosa (AN) is a serious mental disorder that has been shown to be traceable on social media through the analysis of users' written posts. Here we present an approach to generate word embeddings enhanced for a classification task dedicated to the detection of Reddit users with AN. Our method extends Word2vec's objective function in order to bring domain-specific and semantically related words closer together. The approach is evaluated through the calculation of an average similarity measure, and by using the generated embeddings as features for the AN screening task. The results show that our method outperforms fine-tuned pre-learned word embeddings, related methods dedicated to generating domain-adapted embeddings, and representations learned on the training set using Word2vec. This method can potentially be applied and evaluated on similar tasks that can be formalized as document categorization problems. Regarding our use case, we believe that this approach can contribute to the development of proper automated detection tools to alert and assist clinicians.
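The abstract above does not spell out how the average similarity measure is computed. As a rough illustration only, the following sketch assumes it is the mean pairwise cosine similarity among a set of domain terms. The terms, the two-dimensional vectors, and the function names are all hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_pairwise_similarity(embeddings, terms):
    """Mean cosine similarity over all unordered pairs of the given terms."""
    pairs = list(combinations(terms, 2))
    return sum(cosine(embeddings[a], embeddings[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy vectors: a "baseline" space and an "enhanced" space
# in which the domain terms sit closer together.
domain_terms = ["diet", "weight", "calories"]
baseline = {"diet": [1.0, 0.0], "weight": [0.0, 1.0], "calories": [1.0, 1.0]}
enhanced = {"diet": [1.0, 0.2], "weight": [0.8, 0.4], "calories": [1.0, 0.3]}

baseline_avg = average_pairwise_similarity(baseline, domain_terms)
enhanced_avg = average_pairwise_similarity(enhanced, domain_terms)
print(f"baseline: {baseline_avg:.3f}  enhanced: {enhanced_avg:.3f}")
```

Under this assumed reading, a higher average for the enhanced space would indicate that the domain vocabulary has been drawn together.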
This paper proposes an approach for the early detection of anorexia nervosa (AN) on social media. We present a machine learning approach that processes the texts written by social media users. This method relies on a set of features based on domain-specific vocabulary, topics, psychological processes, and linguistic information extracted from the users' writings. This approach penalizes the delay in detecting positive cases in order to classify the users at risk as early as possible. Identifying anorexia early, along with an appropriate treatment, improves the speed of recovery and the likelihood of staying free of the illness. The results of this work show that our proposal is suitable for the early detection of AN symptoms.
Substance abuse and mental health issues are severe conditions that affect millions of people. Signs of certain conditions have been traced on social media through the analysis of posts. In this paper we analyze textual cues that characterize and differentiate Reddit posts related to depression, eating disorders, suicidal ideation, and alcoholism, along with control posts. We also generate enhanced word embeddings for binary and multi-class classification tasks dedicated to the detection of these types of posts. Our enhancement method to generate word embeddings focuses on identifying terms that are predictive for a class and aims to move their vector representations close to each other, while moving them away from the vectors of terms that are predictive for other classes. Variations of the embeddings are defined and evaluated through predictive tasks, a cosine similarity-based method, and a visual approach. We generate predictive models using variations of our enhanced representations with statistical and deep learning approaches. We also propose a method that leverages the properties of the enhanced embeddings in order to build features for predictive models. Results show that variations of our enhanced representations outperform, in Recall, Accuracy, and F1-Score, the embeddings learned with Word2vec, DistilBERT, GloVe's fine-tuned pre-learned embeddings, and other methods based on domain-adapted embeddings. The approach presented has the potential to be used on similar binary or multi-class classification tasks that deal with small domain-specific textual corpora.
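The geometric idea in the abstract above (terms predictive for a class move together and away from terms predictive for other classes) can be illustrated with a small sketch. This is not the authors' training procedure: it is a post-hoc, centroid-based pull/push adjustment used as a stand-in, and the class names, terms, vectors, and the `pull` and `push` parameters are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def pull_push(embeddings, class_terms, pull=0.5, push=0.2):
    """Move each predictive term toward its own class centroid (pull) and
    away from the mean centroid of the other classes (push).
    Assumes at least two classes. Returns a new dict; the input is not modified."""
    centroids = {c: centroid([embeddings[t] for t in terms])
                 for c, terms in class_terms.items()}
    updated = dict(embeddings)
    for c, terms in class_terms.items():
        other = centroid([centroids[o] for o in centroids if o != c])
        for t in terms:
            updated[t] = [x + pull * (own - x) - push * (oth - x)
                          for x, own, oth in zip(embeddings[t], centroids[c], other)]
    return updated


# Hypothetical predictive terms for two classes, with toy 2-d vectors.
vectors = {"purge": [1.0, 0.2], "fasting": [0.8, 0.5],
           "drunk": [0.2, 1.0], "hangover": [0.5, 0.8]}
classes = {"eating_disorder": ["purge", "fasting"],
           "alcoholism": ["drunk", "hangover"]}

adjusted = pull_push(vectors, classes)
print("within-class:", cosine(vectors["purge"], vectors["fasting"]),
      "->", cosine(adjusted["purge"], adjusted["fasting"]))
print("cross-class: ", cosine(vectors["purge"], vectors["drunk"]),
      "->", cosine(adjusted["purge"], adjusted["drunk"]))
```

On this toy data, within-class cosine similarity rises and cross-class similarity falls, which is the kind of contrast a cosine similarity-based evaluation would look for.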
BACKGROUND Eating disorders are psychological conditions characterized by unhealthy eating habits. Anorexia Nervosa (AN) is defined by the belief of being overweight despite being dangerously underweight. Psychological signs involve emotional and behavioral issues. There is evidence that signs and symptoms can be manifested on social media, where both harmful and beneficial content is shared daily. OBJECTIVE The aim of this work is to characterize Spanish-speaking users with signs of anorexia on Twitter through the extraction and inference of behavioral, demographic, relational, and multimodal data. This analysis is focused on characterizing and comparing users at different stages of the process of overcoming the illness, including treatment and full recovery periods, considering the Transtheoretical Model of Health Behavior Change (TTM). METHODS We analyze tweets published by users going through different stages of anorexia. Users are characterized through their writings, posting patterns, relations, and images. We analyze the differences among users going through each stage of the illness and control users (users not suffering from AN). We also analyze the topics of interest of their followees (users followed by them). We apply a clustering approach to distinguish users at an early phase of the illness (precontemplation) from users who recognize that their behavior is problematic (contemplation), and we generate models dedicated to the detection of tweets and images related to AN. We consider two types of control users: focused control users, who use terms related to anorexia, and random control users. RESULTS We found significant differences between users at each stage of the recovery process (P<.001) and control groups. Users with AN tend to tweet more at night, with a median sleep period tweeting ratio of 0.05 in comparison to random control users (0.04) and focused control users (0.03). Pictures are relevant for the characterization of users.
Focused and random control users are characterized by the usage of text on their profile pictures. We also found a strong polarization between focused control users and users at the first stages of the disorder. There was a strong correlation (Spearman’s coefficient) between the interests of users with AN and those of their followees (0.96). Also, the interests of recovered users and users in treatment were more highly correlated with those of the focused control group (0.87 for both) than with those of users with AN (0.67), suggesting a shift in users’ interests during the recovery process. CONCLUSIONS We have mapped signs of Anorexia Nervosa to the social media context. These results reinforce the findings of related work in other languages and involve a deep analysis of the topics of interest of users at each phase of the disorder. The features and patterns identified provide a basis for the development of detection tools and recommender systems.
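The interest correlations reported above use Spearman's coefficient, which is the Pearson correlation computed on ranks. A minimal sketch follows, assuming each group's interests are summarized as a share of activity per topic. The topic shares below are invented for illustration and are unrelated to the study's data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical share of activity per topic for a user group and its followees.
group_shares = [0.30, 0.25, 0.20, 0.15, 0.10]
followee_shares = [0.28, 0.26, 0.18, 0.17, 0.11]
rho_same_order = spearman(group_shares, followee_shares)
rho_reversed = spearman(group_shares, list(reversed(group_shares)))
print(rho_same_order, rho_reversed)
```

Because only the ordering of topics matters, two groups can have a coefficient near 1 even when their absolute topic shares differ.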