Background and objective
Non-suicidal self-injury (NSSI) is a psychological disorder in which sufferers deliberately damage their own body tissue, often so severely that intensive care is required. Because some individuals hide their NSSI behavior, others can identify it only by witnessing the self-injury or through dedicated questionnaires. However, such questionnaires are long and tedious to answer, so the answers may be inconsistent. In this study, for the first time, we used data mining techniques to reduce a larger questionnaire (662 items in total) to only 22 items (questions). We then trained several machine learning algorithms to classify individuals into two classes based on their answers.
Methods
Data from 277 previously questioned participants were used in several data mining methods to select the features (questions) that best represent NSSI. Then, 245 different people were asked to take an online test so that those features could be validated with machine learning methods.
Results
Among the feature sets evaluated, the one selected by the genetic algorithm gave the highest accuracy and F1 score, 80.0% and 74.8% respectively, with a Random Forest classifier. Cronbach’s alpha of the online test (validation of the selected features) was 0.82. Moreover, the results suggest that a multilayer perceptron (MLP) can classify participants as NSSI Positive or NSSI Negative with 83.6% accuracy and an 83.7% F1 score based on the answers to only 22 questions.
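For readers unfamiliar with the reliability figure reported above, the following is a minimal sketch of how Cronbach's alpha is commonly computed from a participants-by-items score matrix. It uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), with sample variances. The function name `cronbach_alpha` and the toy responses are illustrative assumptions; the abstract does not state which software or exact procedure was used to obtain the 0.82 value.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per participant).

    Each row holds that participant's numeric scores on the same k items.
    Hypothetical helper for illustration, not the authors' implementation.
    """
    if len(responses) < 2:
        raise ValueError("need at least two participants")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("all rows must have the same number of items")

    # Sample variance of each item (column) across participants.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up answers from 4 participants to 3 items (not study data).
    toy = [
        [1, 2, 1],
        [2, 2, 3],
        [3, 4, 3],
        [4, 4, 5],
    ]
    print(round(cronbach_alpha(toy), 3))
```

Values closer to 1 indicate that the items vary together; a questionnaire whose items are perfectly consistent with one another yields exactly 1.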
Conclusion
Psychologists have previously relied on many combined questionnaires to determine whether someone engages in NSSI. Using various data mining methods, the present study showed that only 22 questions are enough to predict whether someone engages in NSSI. Different machine learning algorithms were then used to classify participants based on their NSSI behaviors, among which an MLP with 10 hidden layers performed best.