Purpose
Reveal and quantify qualities of reported experiences of chronic pain on social media, from multiple pathological backgrounds, by means of the novel Reddit Reports of Chronic Pain (RRCP) dataset, using Natural Language Processing techniques.
Methods
Define and validate the RRCP dataset for a set of subreddits related to chronic pain. Identify the main concerns discussed in each subreddit. Model each subreddit according to their main concerns. Compare subreddit models.
Results
The RRCP dataset comprises 86,537 Reddit submissions from 12 subreddits related to chronic pain (each related to one pathological background). Each RRCP subreddit has various main concerns. Some of these concerns are shared between multiple subreddits (e.g., the subreddit Sciatica semantically entails the subreddit backpain in their various concerns, but not the other way around), whilst some concerns are exclusive to specific subreddits (e.g., Interstitialcystitis and CrohnsDisease).
Conclusion
These results suggest that the reported experience of chronic pain, from multiple pathologies (i.e., subreddits), has concerns relevant to all, and concerns exclusive to certain pathologies. Our analysis details each of these concerns and their similarity relations. Although limited by intrinsic qualities of the Reddit platform, to the best of our knowledge, this is the first research work attempting to model the linguistic expression of various chronic pain-inducing pathologies and comparing these models to identify and quantify the similarities and differences between the corresponding emergent chronic pain experiences.