UNSTRUCTURED
Background: Sleep disturbance is prevalent among college students and can lead to a range of mental and physical disorders. Identifying potential predictors and developing an accurate prediction model are essential steps for the early detection of, and appropriate intervention in, sleep disturbances. However, previous studies have encountered notable limitations.
Objective: This study aimed to develop and validate a model for predicting sleep quality among college students, with the goal of improving prediction accuracy and facilitating timely interventions.
Methods: We analyzed data collected from 20,645 college students in Fujian Province, China, between 5 April and 16 April 2022. First, the Pittsburgh Sleep Quality Index (PSQI), a self-designed general data questionnaire, and a questionnaire on factors influencing sleep quality were administered to the participants. Second, appropriate variables were selected by comparing the outcomes of multinomial logistic regression, LASSO regression, and Boruta feature selection. The data were then divided into a training–testing set (70%) and an independent validation set (30%) using stratified sampling. We developed and validated six machine learning models: an artificial neural network, a decision tree, a gradient-boosting tree, a k-nearest neighbor model, a naïve Bayes model, and a random forest. Finally, an online sleep evaluation website was established based on the best-fitting prediction model.
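The stratified 70%/30% split described in the Methods can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function name `stratified_split`, the seed, and the PSQI scores are hypothetical, and the labels are derived with the > 7 threshold stated in the Results.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices so each class keeps roughly the same share in both parts.

    Hypothetical helper for illustration; returns (train_idx, valid_idx).
    """
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Take the same fraction from every class.
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        valid_idx.extend(members[cut:])
    return sorted(train_idx), sorted(valid_idx)


# Made-up global PSQI scores (not study data); a score > 7 is labeled 1.
scores = [3, 9, 5, 12, 7, 8, 2, 6, 10, 4] * 10
labels = [int(score > 7) for score in scores]

train_idx, valid_idx = stratified_split(labels)
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
valid_rate = sum(labels[i] for i in valid_idx) / len(valid_idx)
print(len(train_idx), len(valid_idx), train_rate, valid_rate)
```

Splitting within each class, rather than over the pooled sample, keeps the sleep disturbance rate comparable between the training–testing set and the validation set, which matters when the outcome classes are imbalanced.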
Results: The mean global PSQI score was 6.02±3.112, and the sleep disturbance rate was 28.9% (defined as a global PSQI score of > 7 points). The LASSO regression model was preferred because it retained only eight predictors: age, specialty, respiratory history, coffee consumption, staying up late, long hours online, sudden changes, and impatience with closed-loop management. Among the generated models, the artificial neural network (ANN) model showed the best performance, with a cutoff of 0.710, an AUROC of 0.713 (95% CI 0.696-0.730), an accuracy of 0.669 (95% CI 0.669-0.669), a sensitivity of 0.682 (95% CI 0.665-0.699), a specificity of 0.637 (95% CI 0.610-0.665), a precision of 0.822 (95% CI 0.807-0.837), an F1-score of 0.745 (95% CI 0.729-0.795), and a kappa of 0.284 (95% CI 0.255-0.313). In addition, it had a Brier score of 0.182. The calibration curves showed good agreement between the predictions and the observations. A decision curve analysis demonstrated that the model could achieve a net benefit. A clinical impact curve confirmed the high clinical efficiency of the prediction model.
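For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, precision, accuracy, F1-score, Cohen's kappa, and the Brier score are conventionally computed. It uses only the Python standard library. The function names, confusion-matrix counts, and probabilities are hypothetical and chosen for easy arithmetic; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from binary confusion-matrix counts (hypothetical helper)."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / n
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "kappa": kappa,
    }


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Illustrative counts only (assumed, not the study's confusion matrix).
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=80)
# Illustrative predicted probabilities and observed outcomes.
brier = brier_score([0.9, 0.2, 0.6, 0.4], [1, 0, 1, 0])
print(metrics, brier)
```

Kappa corrects accuracy for the agreement expected by chance, so it can be modest even when accuracy looks moderate. The Brier score measures the accuracy of the predicted probabilities themselves, and lower values are better.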
Conclusions: The prediction model, which incorporated eight predictors, was built using LASSO regression and an ANN to estimate the probability of sleep disturbance among college students. This model may serve as an intuitive and practical tool for predicting sleep quality, supporting better management and healthcare on college campuses.