Background
Diabetes is a chronic disease that affects millions of people worldwide. It is therefore unsurprising that there is a high volume of public discussion, resources, and research addressing various aspects of the disease. Over the last decade, researchers have published more than a hundred thousand research articles, and countless discussions have taken place on various online platforms. This study attempts to identify the areas of public interest related to diabetes by examining online discussion forums, and to evaluate their relationship to Wikipedia pages about diabetes and to academic research on the topic. The main aim is to investigate the extent to which researchers are responding to the public's interests and concerns, and the level of uptake of research topics in the public sphere.
Methodology/Principal findings
To detect public interests and concerns in diabetes, we collected posts from a popular diabetes discussion forum (DiabeticConnect) and pages (articles) about diabetes published on Wikipedia. We also downloaded the titles and abstracts of research articles about diabetes from the Scopus database, all between 2008 and 2016. Tags assigned to each post in the discussion forum were used along with the post itself to compute a Labeled Latent Dirichlet Allocation (LLDA) model, which was then used to classify the Wikipedia pages and research articles. The resulting classifications were then used to compare the prevalence of the topics found in the discussion forum with those of the other two sources. The results show that while research articles and Wikipedia pages about diabetes focus on diabetes testing, treatments, and disease control, the public forum discussions focus on Type 2 diabetes, emotional support, and proper diet for diabetic patients. However, for some other topics there was an alignment in the relative rise and fall of interest across the three platforms.
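The prevalence comparison described above can be illustrated with a minimal sketch. It assumes that each document has already been assigned a single topic label by a classifier such as the LLDA model, and it only computes the share of each topic within each source. The function name `topic_prevalence` and the toy labels are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter, defaultdict


def topic_prevalence(labels):
    """Return {source: {topic: share}} from an iterable of (source, topic) pairs.

    The share is the fraction of a source's documents assigned to that topic,
    so shares within one source sum to 1.
    """
    counts = defaultdict(Counter)
    for source, topic in labels:
        counts[source][topic] += 1
    return {
        source: {topic: n / sum(c.values()) for topic, n in c.items()}
        for source, c in counts.items()
    }


# Hypothetical toy labels, standing in for classifier output (not real data).
labels = [
    ("forum", "diet"),
    ("forum", "diet"),
    ("forum", "emotional support"),
    ("forum", "testing"),
    ("research", "testing"),
    ("research", "testing"),
    ("research", "treatment"),
    ("research", "diet"),
]

prevalence = topic_prevalence(labels)

# Compare how prominent one topic is in each source.
for source in ("forum", "research"):
    print(source, prevalence[source].get("diet", 0.0))
```

Using shares rather than raw counts keeps sources of very different sizes comparable; repeating the same computation per year would give the relative rise and fall of each topic over time.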
Conclusions/Significance
The alignment and misalignment in the changes of relative interest across the various topics is evidence that LLDA modelling can be useful for comparing a public corpus, like a diabetes forum, and an academic one, like research titles and abstracts. The success of using LLDA to classify research articles based on the tags assigned to posts in a public discussion forum shows that this is a promising method for better understanding how the scientific community responds to public interests and needs and, conversely, how the public takes up the language and topics discussed by the academic community.