Background
Internet and social media platforms offer insights into the lived experiences of survivors of cancer and their caregivers; however, the volume of narrative data available is often cumbersome for thorough analysis. Survivors of gynecological cancer have unique needs, such as those related to a genetic predisposition to future cancers, impact of cancer on sexual health, the advanced stage at which many are diagnosed, and the influx of new therapeutic approaches.
Objective
This study aimed to present a unique methodology to leverage large amounts of data from internet-based platforms for mixed methods analysis. We analyzed discussion board posts made by survivors of gynecological cancer on the American Cancer Society website with a particular interest in evaluating the psychosocial aspects of survivorship.
Methods
All posts from the ovarian, uterine, and gynecological cancers (other than ovarian and uterine) discussion boards on the American Cancer Society Cancer Survivors Network were included. Posts were web scraped using Python and organized by the psychosocial themes described in the Quality of Cancer Survivorship Care Framework. Keywords related to each theme were generated and verified, and these keywords were then used to identify posts related to the predetermined psychosocial themes. Quantitative analysis was completed using Python and R (R Foundation for Statistical Computing) packages. Qualitative analysis was completed on a subset of posts as a proof of concept. Themes discovered through latent Dirichlet allocation (LDA), an unsupervised topic modeling technique, were assessed and compared with the predetermined themes of interest.
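The keyword-based identification step can be illustrated with a minimal Python sketch. The theme names, keywords, and sample posts below are hypothetical placeholders and not the study's verified keyword list; whole-word, case-insensitive matching is likewise an assumption about how keywords might be applied.

```python
import re
from collections import Counter

# Hypothetical theme-to-keyword map (illustrative only; not the study's list).
THEME_KEYWORDS = {
    "family_and_caregivers": ["family", "husband", "daughter"],
    "fatigue": ["fatigue", "tired", "exhausted"],
    "insurance": ["insurance", "medicare"],
}


def compile_patterns(theme_keywords):
    """Build one case-insensitive, whole-word regex per theme."""
    return {
        theme: re.compile(
            r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
            re.IGNORECASE,
        )
        for theme, words in theme_keywords.items()
    }


def tag_post(text, patterns):
    """Return the sorted list of themes whose keywords appear in the post."""
    return sorted(theme for theme, pat in patterns.items() if pat.search(text))


def theme_counts(posts, patterns):
    """Count posts per theme; one post may count toward several themes."""
    counts = Counter()
    for text in posts:
        counts.update(tag_post(text, patterns))
    return counts


# Made-up sample posts standing in for scraped discussion board text.
posts = [
    "My family has been a huge help since the surgery.",
    "So tired after chemo, and the insurance paperwork never ends.",
    "Does anyone have tips for the port placement?",
]

patterns = compile_patterns(THEME_KEYWORDS)
counts = theme_counts(posts, patterns)
# Posts matching at least one theme would enter the psychosocial analysis.
psychosocial = [p for p in posts if tag_post(p, patterns)]
```

In this sketch a post can belong to more than one theme, so per-theme counts may sum to more than the number of matched posts.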
Results
A total of 125,498 posts made by 6436 survivors of gynecological cancer and caregivers between July 2000 and February 2020 were evaluated. Of these posts, 23,458 (18.69%) were related to the psychosocial experience of cancer and were included in the mixed methods psychosocial analysis. Quantitative analysis (23,458 posts) revealed that survivors across all gynecological cancer discussion boards most frequently discussed the role of friends and family in care, as well as fatigue, the effect of cancer on interpersonal relationships, and health insurance status. The words related to psychosocial aspects of survivorship that were used most often in posts included “family,” “hope,” and “help.” Qualitative analysis (20 of the 23,458 posts) similarly demonstrated that survivors frequently discussed coping strategies, distress and worry, the role of family and caregivers in their cancer care, and the toll of managing financial and insurance concerns. Using LDA, we discovered 8 themes, none of which were directly related to psychosocial aspects of survivorship. Of the 56 keywords identified by LDA, 2 (4%), “sleep” and “work,” were included in the keyword list that we independently devised.
Conclusions
Web-based discussion platforms offer a valuable opportunity to learn about patient experiences of survivorship. Our novel methodology expedites quantitative and qualitative analyses of such large volumes of narrative data and may be applied to additional patient populations.