(1) Background: Poor adherence to management behaviors among Chinese patients with Type 2 diabetes mellitus (T2DM) leads to poorly controlled diabetes, which imposes significant economic costs on China. It is therefore imperative to quickly locate vulnerability factors in the management behavior of patients with T2DM. (2) Methods: We conducted a thematic analysis of the collected interview materials to construct the themes of T2DM management vulnerability. We then assessed the applicability of pre-trained models to text classification based on evaluation metrics. (3) Results: We constructed 12 themes of vulnerability related to the health and well-being of people with T2DM in Tianjin. We found that Bidirectional Encoder Representations from Transformers (BERT) performed better in this Natural Language Processing (NLP) task, with a shorter completion time. With a splitting ratio of 6:3:1 and a batch size of 64, BERT reached a test accuracy of 97.71% and a macro-F1 score of 0.9752, with a completion time of 10 min 24 s. (4) Conclusions: Our results demonstrate the applicability of NLP techniques in this specific Chinese-language medical environment. They help fill the knowledge gap in the application of NLP technologies to diabetes management and provide strong support for using NLP techniques to rapidly locate vulnerability factors in T2DM management.
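The two classification metrics reported above, test accuracy and macro-F1, can be illustrated with a minimal Python sketch. The function names and the toy labels are hypothetical and are not taken from the study; macro-F1 is computed here as the unweighted mean of the per-class F1 scores, which is the usual definition but is an assumption about how the authors computed it.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three hypothetical theme labels (not study data).
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Because every class contributes equally to the mean, macro-F1 penalizes poor performance on rare themes more than accuracy does, which is one reason to report both for a 12-theme classifier.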
Background: Diabetes has become a global public health priority, resulting in significant workforce losses and health care expenditures. Research on diabetes vulnerability has therefore become imperative. Current methods for studying disease vulnerability mainly rely on qualitative research methods, represented by Thematic Analysis (TCA), which has the disadvantage of being staff-intensive over long periods of time. Natural Language Processing (NLP) can achieve efficient results in information mining tasks, but we found few studies applying NLP to non-infectious chronic diseases. Methods: In this study, hyperparameters were adjusted to obtain a more cost-effective model applicable to the Cities Changing Diabetes vulnerability data by comparing Bidirectional Encoder Representations from Transformers (BERT) and Enhanced Language Representation with Informative Entities (ERNIE) in terms of test accuracy, completion time, and classification evaluation metrics. Results: BERT took less time under the same hyperparameters, while the test accuracy of ERNIE was slightly better than that of BERT. We further adjusted the batch size of ERNIE and found that ERNIE was more efficient with a splitting ratio of 8:1:1 and a batch size of 64, reaching a test accuracy of 97.67%, a completion time of 12 min 36 s, and a macro-F1 score of 0.9734. Conclusions: In this study, BERT outperformed ERNIE in completion speed under the same hyperparameters. ERNIE showed higher accuracy, performing best at the split ratio of 8:1:1 after its batch size was adjusted. We pursue a model with both high accuracy and fast processing speed, that is, one that obtains the highest accuracy in the shortest time. In practice, the model can be selected according to the actual situation.
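The two hyperparameters varied above, splitting ratio and batch size, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the helper names are hypothetical, the ratio is assumed to mean train:validation:test in that order, and the dataset size of 1000 is invented for the example.

```python
import math
import random


def split_by_ratio(items, ratio=(8, 1, 1), seed=0):
    """Shuffle items and split them into three parts according to ratio.

    The ratio is assumed to be (train, validation, test); any remainder
    from integer division goes to the last part.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    total = sum(ratio)
    n = len(shuffled)
    n_train = n * ratio[0] // total
    n_val = n * ratio[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def batches_per_epoch(n_examples, batch_size=64):
    """Number of optimizer steps needed to see every example once."""
    return math.ceil(n_examples / batch_size)


# Hypothetical dataset of 1000 labelled text segments.
train, val, test = split_by_ratio(range(1000), ratio=(8, 1, 1))
print(len(train), len(val), len(test))
print(batches_per_epoch(len(train), batch_size=64))
```

A larger batch size means fewer steps per epoch, which is one plausible way batch size could affect the completion times reported above; the actual effect also depends on hardware and on the model.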