BACKGROUND
Uric acid is associated with non-communicable diseases (NCDs) such as cardiovascular disease, chronic kidney disease, coronary artery disease, stroke, diabetes, metabolic syndrome, vascular dementia, and hypertension. Therefore, uric acid is considered a risk factor for the development of NCDs. Most studies on uric acid have been performed in developed countries, and to our knowledge, the application of machine learning (ML) approaches to uric acid prediction in developing countries is rare. Different ML algorithms perform differently on different types of data and for different diseases (e.g., cancers, diabetes), so each type of data needs its own investigation to identify the most accurate algorithm. In particular, no study has yet focused on the urban corporate population in Bangladesh, although this group is more likely to develop NCDs.
OBJECTIVE
The aim of this study is to use machine learning approaches to predict blood uric acid from basic health checkup test results, dietary information, and socio-demographic characteristics. Predicting health checkup test measurements can help reduce health management costs.
METHODS
This study used machine learning approaches because clinical input data are not completely independent and complex interactions exist among them. Conventional statistical models are limited in their ability to account for these interactions, whereas machine learning models can capture complex interactions among input variables. We used several machine learning approaches, including Boosted Decision Tree Regression, Decision Forest Regression, Bayesian Linear Regression, and Linear Regression, to predict personalized blood uric acid from basic health checkup test results, dietary information, and socio-demographic characteristics. In total, we evaluated the performance of five widely used machine learning models. Data were collected from 271 employees who work in the Grameen Bank complex in Dhaka, Bangladesh.
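The comparison described above can be illustrated with a minimal sketch: fit each candidate model on a training split and rank the candidates by hold-out RMSE. This is not the study's pipeline. The two stand-in models (a mean baseline and a one-feature least-squares fit), the function names, the 70/30 split, and the synthetic data are all assumptions made for illustration only.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Hypothetical baseline: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Hypothetical stand-in model: least squares with a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def compare_models(xs, ys, fitters, test_fraction=0.3, seed=0):
    """Fit each model on a training split and return its hold-out RMSE."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    x_tr, y_tr = [xs[i] for i in train], [ys[i] for i in train]
    x_te, y_te = [xs[i] for i in test], [ys[i] for i in test]
    scores = {}
    for name, fit in fitters.items():
        model = fit(x_tr, y_tr)
        scores[name] = rmse(y_te, [model(x) for x in x_te])
    return scores


# Synthetic data (NOT from the study): one made-up feature, noisy linear target.
rng = random.Random(42)
feature = [rng.uniform(18.0, 35.0) for _ in range(271)]
target = [2.0 + 0.18 * x + rng.gauss(0.0, 0.5) for x in feature]

scores = compare_models(feature, target, {"mean_baseline": fit_mean, "linear": fit_linear})
best = min(scores, key=scores.get)  # the model with the lowest hold-out RMSE
print(scores, best)
```

Evaluating on data the models did not see during fitting is the design choice that matters here: an RMSE computed on the training data alone would tend to favor the most flexible model regardless of how well it generalizes.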
RESULTS
The mean uric acid measurement was 6.62 mg/dL, which is close to the upper limit of the normal range (<7.0 mg/dL). On average, therefore, the participants' uric acid is at a borderline level, and they need to check their uric acid regularly. Based on the root mean squared error (RMSE), the Boosted Decision Tree Regression model showed the best performance among the models (RMSE 0.03); this RMSE is better than any reported in the literature.
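For reference, RMSE is conventionally defined as below. The symbols are ours rather than the study's: $y_i$ is the measured uric acid of participant $i$, $\hat{y}_i$ is the model's prediction, and $n$ is the number of evaluated samples. When computed on unscaled measurements, RMSE is expressed in the same unit as the target (here mg/dL); the abstract does not state whether the reported value was computed on scaled or unscaled data.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
```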
CONCLUSIONS
This study developed a uric acid prediction model based on personal characteristics, dietary information, and some basic health checkup measurements. Such a model can be useful for improving awareness among high-risk subjects. By predicting uric acid, this approach can help reduce medical costs. A future study could include additional features (e.g., work stress, daily physical activity, alcohol intake, and red meat consumption).
CLINICALTRIAL
The authors obtained ethical approval from the National Research Ethics Committee (NREC) of the Bangladesh Medical Research Council with approval no. 18325022019.