Background
The COVID-19 pandemic has disrupted the usual functioning of many hospitalization units (wards). Few studies have used electronic nursing clinical notes (ENCN) and their unstructured text to identify changes in patients' feelings and in therapeutic procedures of interest.
Objective
This study aimed to analyze positive and negative sentiments in the free text of ENCN, compare sentiments between the ENCN of hospitalized patients with and without COVID-19, perform a temporal analysis of patients' sentiments at the start of the first wave of the COVID-19 pandemic, and identify the topics in ENCN.
Methods
This is a descriptive study based on analysis of the text content of ENCN. All ENCN recorded between January and June 2020 at Guadarrama Hospital (Madrid, Spain) and extracted from the CGM Selene Electronic Health Records System were included. Two groups of ENCN were analyzed: one from patients hospitalized in post–intensive care units for COVID-19 and a second from hospitalized patients without COVID-19. A sentiment analysis was performed on the lemmatized text using the National Research Council of Canada (NRC), AFINN, and Bing dictionaries. A polarity analysis of the sentences was performed using the Bing dictionary, SO Dictionaries V1.11, and the Spa dictionary as amplifiers and de-amplifiers. Machine learning techniques were applied to evaluate whether the ENCN differed significantly between patients with and without COVID-19. Finally, a structural analysis of topic models was performed to study the abstract topics occurring in the ENCN, using latent Dirichlet allocation topic modeling.
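As an illustration of the sentence-level polarity step, the following is a minimal sketch of a lexicon-based scorer in which nearby words strengthen, weaken, or negate a sentiment term. It is not the study's implementation: the toy word lists, the 0.8 shift weight, the two-word context window, and the function name `sentence_polarity` are all assumptions chosen for demonstration, and the dictionaries actually used in the study are not reproduced here.

```python
import math
import re

# Hypothetical toy lexicons for illustration only; they are NOT the Bing,
# SO Dictionaries V1.11, or Spa dictionaries used in the study.
POLARITY = {"calm": 1.0, "comfortable": 1.0, "stable": 1.0,
            "pain": -1.0, "fever": -1.0, "anxious": -1.0}
AMPLIFIERS = {"very", "extremely", "severe"}    # strengthen a sentiment word
DEAMPLIFIERS = {"slightly", "mild", "barely"}   # weaken a sentiment word
NEGATORS = {"no", "not", "without"}             # flip the sign
SHIFT_WEIGHT = 0.8                              # assumed weight, not from the study


def sentence_polarity(sentence: str, window: int = 2) -> float:
    """Return a polarity score for one sentence.

    Each sentiment word is adjusted by shifter words found in the `window`
    tokens before it; the sum is divided by sqrt(number of tokens) so that
    long sentences are not favored.
    """
    tokens = re.findall(r"[a-záéíóúñü]+", sentence.lower())
    if not tokens:
        return 0.0
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in POLARITY:
            continue
        score = POLARITY[token]
        for prev in tokens[max(0, i - window):i]:
            if prev in AMPLIFIERS:
                score *= 1 + SHIFT_WEIGHT
            elif prev in DEAMPLIFIERS:
                score *= 1 - SHIFT_WEIGHT
            elif prev in NEGATORS:
                score = -score
        total += score
    return total / math.sqrt(len(tokens))


if __name__ == "__main__":
    for note in ["Patient calm and comfortable",
                 "Very anxious during the night",
                 "No fever, slightly anxious"]:
        print(f"{sentence_polarity(note):+.3f}  {note}")
```

Averaging such sentence scores within each patient group would give per-group polarity means of the kind compared in this study, although the exact weighting scheme used there may differ.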
Results
A total of 37,564 electronic health records were analyzed. Sentiment analysis of the ENCN showed that patients with subacute COVID-19 had a higher proportion of positive sentiments than those without COVID-19. Polarity also differed significantly between the two groups (Z=5.532, P<.001), with a polarity of 0.108 (SD 0.299) in patients with COVID-19 versus 0.09 (SD 0.301) in those without COVID-19. In the machine learning analysis, all models yielded high values, but the neural network presented the best indicators (>0.8), with significant P values between the two groups. Through structural topic modeling analysis, a final model containing 10 topics was selected. High correlations were noted among topics 2, 5, and 8 (pressure ulcer and pharmacotherapy treatment); topics 1, 4, 7, and 9 (incidences related to fever and well-being state, and baseline oxygen saturation); and topics 3 and 10 (blood glucose level and pain).
Conclusions
The ENCN may help in the development and implementation of more effective programs that allow patients with COVID-19 to return to their prepandemic lifestyle faster. Topic modeling could help identify specific clinical problems in patients and better target the care they receive.