Introduction: News media play an important role in raising public awareness, framing public opinion, affecting policy formulation, and promoting acknowledgment of public health issues. Traditional qualitative content analysis of news sentiments and focuses is time-consuming and may not efficiently convey the sentiments or the focuses of news media. Methods: We used descriptive statistics and state-of-the-art text mining, namely sentiment analysis and topic modeling, to efficiently analyze over 3 million Reuters news articles published during 2007–2017 and identify their coverage, sentiments, and focuses regarding public health issues. Based on the top keywords from public health scientific journals, we identified 10 major public health issues (i.e., “air pollution,” “alcohol drinking,” “asthma,” “depression,” “diet,” “exercise,” “obesity,” “pregnancy,” “sexual behavior,” and “smoking”). Results: The news coverage of seven public health issues (“Smoking,” “Exercise,” “Alcohol drinking,” “Diet,” “Obesity,” “Depression,” and “Asthma”) decreased over time. The news coverage of “Sexual behavior,” “Pregnancy,” and “Air pollution” fluctuated during 2007–2017. The sentiments of the news articles for three of the public health issues (“exercise,” “alcohol drinking,” and “diet”) were predominantly positive and associated with terms such as “energy.” Sentiments for the remaining seven public health issues were mainly negative and linked to negative terms, e.g., diseases. The results of topic modeling reflected the media’s focus on public health issues. Conclusions: Text mining methods may address the limitations of traditional qualitative approaches. Using big data to understand public health needs is a novel approach that could help clinical and translational science awards programs focus community-engaged research efforts on community priorities.
Background The aging population has led to an increase in cognitive impairment (CI), resulting in significant costs to patients, their families, and society. Research on a large cohort to better understand the frequency and severity of CI is urgently needed to respond to the health needs of this population. However, little is known about temporal trends in patient health functions (i.e., activities of daily living [ADL]) and how these trends are associated with the onset of CI in elderly patients. In addition, the use of clinical free text in electronic health records (EHRs), a rich data source, to facilitate CI research has not been well explored. The aim of this study is to characterize and better understand early signals of CI in elderly patients by examining temporal trends in patient ADL and analyzing topics of patient medical conditions in clinical free text using topic models. Methods The study cohort consists of physician-diagnosed CI patients (n = 1,435) and cognitively unimpaired (CU) patients (n = 1,435) matched by age and sex, selected from patients 65 years of age or older at the time of enrollment in the Mayo Clinic Biobank. A corpus analysis was performed to examine the basic statistics of event types and practice settings where the physician first diagnosed CI. We analyzed the distribution of ADL in three different age groups over time before the development of CI. Furthermore, we applied three different topic modeling approaches to clinical free text to examine how patients’ medical conditions changed over time as they approached CI diagnosis. Results The trajectories of ADL deterioration became steeper in CI patients than in CU patients approximately 1 to 1.5 years before the actual physician diagnosis of CI. The topic modeling showed that the topic terms were mostly correlated and captured the underlying semantics relevant to CI as the CI diagnosis approached.
Conclusions There are notable differences in temporal trends of basic and instrumental ADL between CI and CU patients. The trajectories of certain individual ADL, such as bathing and responsibility for one's own medication, were closely associated with CI development. The topic terms obtained by topic modeling methods from clinical free text have the potential to show how CI patients’ conditions evolve and to reveal overlooked conditions as they approach CI diagnosis.
Multidimensional databases and OLAP tools, which provide an efficient framework for data mining, have been pushing the field toward the OLAM (online analytical mining) architecture. OLAP is widely used for meaningful, interactive analysis of data with complex structure. In contrast, detecting and exploring hidden patterns in the data is the task of data mining. OLAP and data mining are believed to complement each other in efficiently analyzing large data sets in decision support systems. Unlike previous work in this field, the proposed method does not rely on the availability of domain-specific knowledge. Variables are selected according to the user's input to build cubes. Hierarchical clustering is used to obtain dynamic relationships between variables at different levels of the data. Results on the Adult data set show that the lift obtained with Fuzzy AprioriTid was higher than that obtained with the Apriori algorithm.
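For readers unfamiliar with the lift measure mentioned above, the following is a minimal sketch of how lift is conventionally computed for a crisp (non-fuzzy) association rule: lift(A → B) = support(A ∪ B) / (support(A) × support(B)). It does not reproduce the paper's Fuzzy AprioriTid method. The function names and the toy transactions are hypothetical, chosen only to resemble attributes of the Adult data set.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def lift(transactions, antecedent, consequent):
    """Lift of the rule antecedent -> consequent (crisp definition)."""
    joint = support(transactions, set(antecedent) | set(consequent))
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    if s_a == 0 or s_c == 0:
        raise ValueError("lift is undefined when either side has zero support")
    return joint / (s_a * s_c)


# Hypothetical toy transactions (not taken from the paper).
transactions = [
    {"degree=Bachelors", "income=>50K"},
    {"degree=Bachelors", "income=>50K"},
    {"degree=Bachelors", "income=<=50K"},
    {"degree=HS-grad", "income=<=50K"},
    {"degree=HS-grad", "income=<=50K"},
]

rule_lift = lift(transactions, {"degree=Bachelors"}, {"income=>50K"})
print(f"lift = {rule_lift:.3f}")
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than would be expected if they were independent; a value of 1 indicates independence.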