Background
Positive mental health is arguably of growing importance and is reflected, to some extent, in psychological well-being (PWB). However, PWB is difficult to assess in real time on a large scale. The popularity and proliferation of social media make it possible to sense and monitor online users’ PWB in a nonintrusive way. This study therefore tests how effectively language expression on social media predicts PWB.
Objective
This study aims to investigate, from a psychological perspective, how well social media data predict ground truth well-being data.
Methods
We recruited 1427 participants. Their well-being was evaluated using 6 dimensions of PWB. Their posts on social media were collected, and 6 psychological lexicons were used to extract linguistic features. A multiobjective prediction model was then built with the extracted linguistic features as input and PWB as the output. Further, the validity of the prediction model was confirmed by evaluating the model's discriminant validity, convergent validity, and criterion validity. The reliability of the model was also confirmed by evaluating the split-half reliability.
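The feature extraction step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the two toy lexicons, the function name `lexicon_features`, and the simple lowercase English tokenizer are all assumptions introduced here, since the 6 psychological lexicons and the text processing used in the study are not specified in this abstract.

```python
import re
from collections import Counter

# Hypothetical toy lexicons for illustration only; the study used
# 6 psychological lexicons that are not named in this abstract.
LEXICONS = {
    "positive_emotion": {"happy", "glad", "love"},
    "negative_emotion": {"sad", "angry", "afraid"},
}


def lexicon_features(posts, lexicons=LEXICONS):
    """Return, per lexicon category, the share of a user's tokens in that category."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = [t for post in posts for t in re.findall(r"[a-z']+", post.lower())]
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for category, words in lexicons.items():
            if token in words:
                counts[category] += 1
    # Users with no tokens get 0.0 for every category instead of a division error.
    return {c: (counts[c] / total if total else 0.0) for c in lexicons}
```

Under these assumptions, each user is represented by one feature vector (one value per lexicon category), which would then serve as input to a model that predicts the 6 PWB dimensions.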
Results
The correlation coefficients between the PWB scores predicted by the linguistic prediction model and the actual scores of social media users were between 0.49 and 0.54 (P<.001), indicating good criterion validity. In terms of structural validity, the model exhibited excellent convergent validity but less than satisfactory discriminant validity. The results also suggested good split-half reliability on every dimension (ranging from 0.65 to 0.85; P<.001).
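Correlation coefficients like those reported above can be illustrated with a short sketch. It assumes a Pearson correlation, which the abstract does not state explicitly, and the function name `pearson_r` is introduced here for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

For criterion validity, `x` would hold the predicted scores on one PWB dimension and `y` the questionnaire scores for the same users. One plausible way to obtain a split-half figure, again an assumption rather than the study's stated procedure, is to split each user's posts into two halves, predict PWB from each half separately, and correlate the two sets of predictions.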
Conclusions
By confirming the validity and reliability of the linguistic prediction model, this study verified that social media data can predict ground truth well-being data from the perspective of PWB. Our study has positive implications for the use of social media to predict mental health in nonprofessional settings such as self-testing or large-scale user studies.