Background: Use of the social media website Twitter is highly prevalent and has led to a plethora of Web-based social and health-related data available for use by researchers. As such, researchers are increasingly using data from social media to retrieve and analyze mental health-related content. However, there is limited evidence regarding why people use this emerging platform to discuss mental health problems in the first place. Objectives: The aim of this study was to explore the reasons why individuals discuss mental health on the social media website Twitter. The study was the first of its kind to implement a study-specific hashtag for research; therefore, we also examined how feasible it was to circulate and analyze a study-specific hashtag for mental health research. Methods: Text mining methods using the Twitter Streaming Application Programming Interface (API) and Twitter Search API were used to collect and organize tweets from the hashtag #WhyWeTweetMH, circulated between September 2015 and November 2015. Tweets were analyzed thematically to understand the key reasons for discussing mental health using the Twitter platform. Results: Four overarching themes were derived from the 132 tweets collected: (1) sense of community; (2) raising awareness and combatting stigma; (3) safe space for expression; and (4) coping and empowerment. In addition, 11 associated subthemes were also identified. Conclusions: The themes derived from the content of the tweets highlight the perceived therapeutic benefits of Twitter through the provision of support and information and the potential for self-management strategies. The ability to use Twitter to combat stigma and raise awareness of mental health problems indicates the societal benefits that can be facilitated via the platform. The number of tweets and themes identified demonstrates the feasibility of implementing study-specific hashtags to explore research questions in the field of mental health and can be used as a basis for other health-related research.
Objective: We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data. Materials and Methods: We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions. Training data consisted of 15 717 annotated tweets for (1), 10 260 for (2), and 6650 ADR phrases and identifiers for (3); and exhibited typical properties of social-media-based health-related texts. Systems were evaluated using 9961, 7513, and 2500 instances for the 3 subtasks, respectively. We evaluated performances of classes of methods and ensembles of system combinations following the shared tasks. Results: Among 55 system runs, the best system scores for the 3 subtasks were 0.435 (ADR class F1-score) for subtask-1, 0.693 (micro-averaged F1-score over two classes) for subtask-2, and 88.5% (accuracy) for subtask-3. Ensembles of system combinations obtained best scores of 0.476, 0.702, and 88.7%, outperforming individual systems. Discussion: Among individual systems, support vector machines and convolutional neural networks showed high performance. Performance gains achieved by ensembles of system combinations suggest that such strategies may be suitable for operational systems relying on difficult text classification tasks (eg, subtask-1). Conclusions: Data imbalance and lack of context remain challenges for natural language processing of social media text. Annotated data from the shared task have been made available as reference standards for future studies (http://dx.doi.org/10.17632/rxwfb3tysd.1).
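The evaluation measures named in the abstract above (a single-class F1-score, a micro-averaged F1-score over a subset of classes, and ensembles of system outputs) can be illustrated with a minimal sketch. The function names, the toy labels, and the use of simple majority voting are assumptions made for illustration; the abstract does not state how the shared task scorer or the ensembles were actually implemented.

```python
from collections import Counter


def class_f1(gold, pred, positive):
    """F1-score for one class, treating `positive` as the target label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def micro_f1(gold, pred, classes):
    """Micro-averaged F1: pool TP/FP/FN counts over `classes`, then score."""
    tp = fp = fn = 0
    for c in classes:
        tp += sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp += sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn += sum(1 for g, p in zip(gold, pred) if g == c and p != c)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def majority_vote(system_predictions):
    """Combine per-system label lists by majority vote (an assumed strategy).

    With an odd number of systems and binary labels no ties can occur.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*system_predictions)]


# Toy data, invented for illustration only.
gold_binary = [1, 0, 1, 1, 0, 0]
pred_binary = [1, 0, 0, 1, 1, 0]
gold_multi = [1, 2, 3, 1, 2, 3]
pred_multi = [1, 2, 1, 3, 2, 2]
ensemble = majority_vote([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
```

Scoring only the positive class, as in `class_f1`, is a common choice when the class of interest is rare, because accuracy on an imbalanced set can look high even when few positive instances are found.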
In recent years, social media websites have been suggested as a novel, vast source of data which may be useful for deriving drug safety information. Despite this, there are few published reports of drug safety profiles derived in this way. The aims of this study were to detect and quantify glucocorticoid-related adverse events (AEs) using a computerised system for automated detection of suspected adverse drug reactions (ADRs) from narrative text in Twitter, and to compare the frequency of specific ADR mentions within Twitter to the frequency and patterns of spontaneous ADR reporting to a national drug regulatory body. Of 159,297 tweets mentioning either prednisolone or prednisone between 1st October 2012 and 30th June 2015, 20,206 tweets were deemed to contain information resembling an ADR. The top AE MedDRA® Preferred Terms were 'insomnia' and 'weight increased', both recognised non-serious but common side effects. These were proportionally over-reported in Twitter when compared to spontaneous reports in the UK regulator's ADR reporting scheme. Serious glucocorticoid-related AEs were reported less frequently. Pharmacovigilance using Twitter data has the potential to be a valuable, supplementary source of drug safety information. In particular, it can illustrate which drug side effects patients discuss most commonly, potentially because of important impacts on quality of life. This information could help clinicians to inform patients about frequent and relevant non-serious side effects as well as more serious side effects.
The medical concept normalisation task aims to map textual descriptions to standard terminologies such as SNOMED-CT or MedDRA. Existing publicly available datasets annotated using different terminologies cannot be simply merged and utilised, and therefore become less valuable when developing machine learning-based concept normalisation systems. To address that, we designed a data harmonisation pipeline and engineered a corpus of 27,979 textual descriptions simultaneously mapped to both MedDRA and SNOMED-CT, sourced from five publicly available datasets across biomedical and social media domains. The pipeline can be used in the future to integrate new datasets into the corpus and could also be applied in relevant data curation tasks. We also described a method to merge different terminologies into a single concept graph preserving their relations and demonstrated that a representation learning approach based on random walks on a graph can efficiently encode both hierarchical and equivalent relations and capture semantic similarities not only between concepts inside a given terminology but also between concepts from different terminologies. We believe that making a corpus and embeddings for cross-terminology medical concept normalisation available to the research community would contribute to a better understanding of the task.
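The idea of merging two terminologies into one concept graph and sampling random walks over it can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: the node identifiers are invented placeholders rather than real MedDRA or SNOMED-CT codes, the walks are uniform and unbiased, and the step that would turn walks into embeddings (for example, a skip-gram model) is omitted.

```python
import random

# Hypothetical edges. Hierarchical edges stay inside one terminology,
# while equivalence edges link concepts across the two terminologies.
EDGES = [
    ("MEDDRA:insomnia", "MEDDRA:sleep_disorders"),      # hierarchical
    ("MEDDRA:somnolence", "MEDDRA:sleep_disorders"),    # hierarchical
    ("SNOMED:insomnia", "SNOMED:sleep_disorder"),       # hierarchical
    ("SNOMED:drowsy", "SNOMED:sleep_disorder"),         # hierarchical
    ("MEDDRA:insomnia", "SNOMED:insomnia"),             # equivalence
    ("MEDDRA:somnolence", "SNOMED:drowsy"),             # equivalence
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    # Sorted neighbour lists keep sampling reproducible for a fixed seed.
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


def random_walks(graph, walk_length, walks_per_node, seed=0):
    """Sample uniform random walks that start from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        starts = sorted(graph)
        rng.shuffle(starts)
        for start in starts:
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


graph = build_graph(EDGES)
walks = random_walks(graph, walk_length=5, walks_per_node=2, seed=1)
```

Because equivalence edges are ordinary edges in the merged graph, a walk can cross from one terminology into the other, so concepts from both can share a context window in whatever embedding model consumes the walks.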