Obesity is strongly associated with multiple risk factors and contributes significantly to the increased risk of chronic disease morbidity and mortality worldwide. Understanding the association between risk factors and the occurrence of obesity remains challenging. The traditional regression approach limits analysis to a small number of predictors and imposes assumptions of independence and linearity. Machine Learning (ML) methods offer an alternative approach to the analysis of obesity data. This study aims to assess the ability of three ML methods, namely Logistic Regression, Classification and Regression Trees (CART), and Naïve Bayes, to identify the presence of obesity using publicly available health data, as an attempt to go beyond traditional prediction models, and to compare the performance of the three methods. The main objective is to establish a set of risk factors for obesity in adults among the available study variables. Furthermore, we address data imbalance using the Synthetic Minority Oversampling Technique (SMOTE) before predicting obesity status based on the risk factors available in the dataset. The Logistic Regression method shows the highest performance. Nevertheless, kappa coefficients show only moderate concordance between predicted and measured obesity. Location, marital status, age group, education, sweet drinks, fatty/oily foods, grilled foods, preserved foods, seasoning powders, soft/carbonated drinks, alcoholic drinks, mental emotional disorders, diagnosed hypertension, physical activity, smoking, and fruit and vegetable consumption are significant in predicting obesity status in adults.
Identifying these risk factors could inform health authorities in designing new policies or modifying existing ones to better control chronic diseases, especially in relation to risk factors associated with obesity. Moreover, applying ML methods to publicly available health data, such as the Indonesian Basic Health Research (RISKESDAS), is a promising strategy for building a more robust understanding of how multiple risk factors combine to predict health outcomes.
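The kappa coefficient mentioned in the abstract above measures agreement between predicted and measured labels after correcting for agreement expected by chance. The following is a minimal sketch of Cohen's kappa, assuming binary labels; the function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the study or from RISKESDAS.

```python
from collections import Counter


def cohen_kappa(measured, predicted):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    # Observed agreement: share of cases where both labels match.
    p_observed = sum(m == p for m, p in zip(measured, predicted)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    measured_counts = Counter(measured)
    predicted_counts = Counter(predicted)
    p_chance = sum(
        measured_counts[label] * predicted_counts[label]
        for label in measured_counts
    ) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy labels (1 = obese, 0 = non-obese), for illustration only.
measured = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(measured, predicted)
print(round(kappa, 3))
```

In this toy case 8 of 10 labels agree, yet kappa is well below 0.8 because part of that agreement would be expected by chance alone. This is why a classifier can report high accuracy and still show only moderate concordance.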
Diabetes mellitus (DM) is a chronic and deadly disease that is widely observed in many countries today, and it continues to increase to a very alarming level. This study aims to identify the factors that influence DM and the relationships between them. The method used is the C4.5 algorithm, a decision tree algorithm for predictive classification. Classification is a data mining process that finds patterns in relatively large data, here represented in the form of decision trees. The method is applied to medical records of patients with DM from 2014 to 2018, taken from the Hasanuddin University Teaching Hospital. The results indicate that four factors influence the prediction of a patient's DM status: Fasting Blood Glucose (GDP), LDL Cholesterol, Triglycerides, and Body Weight.
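C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below illustrates that criterion on categorical attributes only; the full algorithm also handles continuous attributes through thresholds and prunes the tree. The attribute names and toy records are hypothetical and do not come from the hospital data.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute, the C4.5 split criterion."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    # Expected entropy left after splitting on the attribute.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many small branches.
    split_info = -sum(
        len(g) / n * math.log2(len(g) / n) for g in groups.values()
    )
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records, for illustration only.
labels = ["DM", "DM", "DM", "DM", "no DM", "no DM", "no DM", "no DM"]
attributes = {
    "fasting_glucose": ["high"] * 4 + ["normal"] * 4,
    "body_weight": ["heavy", "light"] * 4,
}
ratios = {name: gain_ratio(vals, labels) for name, vals in attributes.items()}
best_attribute = max(ratios, key=ratios.get)
print(ratios, best_attribute)
```

Here the hypothetical `fasting_glucose` attribute separates the classes perfectly, so it would become the root split, while `body_weight` carries no information in this toy sample.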
Obesity has become a rising global health problem affecting quality of life for adults. The objective of this study is to describe the prevalence of obesity in Indonesian adults by island cluster and to identify the risk factors of obesity in each cluster. This study analyzes secondary data from the Indonesian Basic Health Research 2018. Data for this analysis comprised 618,910 adults (≥18 years) randomly selected, proportionate to the population size throughout Indonesia. We included 20 variables covering socio-demographic and obesity-related risk factors. Obesity status was defined as a Body Mass Index (BMI) ≥ 25 kg/m². The study defines 7 major island clusters, consisting of 34 provinces in Indonesia, as the unit of analysis. Descriptive analysis was conducted to determine the characteristics of the population and to calculate the prevalence of obesity within the provinces in each of the island clusters. Multivariate logistic regression analyses to calculate the odds ratios (ORs) were performed using R version 3.6.3. The results show that every island cluster has at least one province with an obesity prevalence above the national prevalence (35.4%). Six out of twenty variables, comprising four dietary factors (the consumption of sweet food, high-salt food, meat, and carbonated drinks) and one psychological factor (mental health disorders), varied across the island clusters. In conclusion, obesity prevalence varied among provinces both within and between island clusters. The variation in risk factors found in each island cluster suggests that the government should rethink its current intervention strategies to address obesity.
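The descriptive step above (flagging obesity at BMI ≥ 25 kg/m² and computing prevalence per group) can be sketched as follows. This is an unweighted illustration only: it ignores any survey weights the study may have applied, and the cluster names and records are hypothetical.

```python
from collections import defaultdict

OBESITY_BMI_THRESHOLD = 25.0  # kg/m², the cut-off stated in the abstract


def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / (height_m ** 2)


def prevalence_by_group(records):
    """Return obesity prevalence (%) per group.

    records: iterable of (group, weight_kg, height_m) tuples.
    """
    totals = defaultdict(int)
    obese = defaultdict(int)
    for group, weight_kg, height_m in records:
        totals[group] += 1
        if bmi(weight_kg, height_m) >= OBESITY_BMI_THRESHOLD:
            obese[group] += 1
    return {group: 100.0 * obese[group] / totals[group] for group in totals}


# Hypothetical respondents, for illustration only.
records = [
    ("Cluster A", 80.0, 1.60),
    ("Cluster A", 55.0, 1.65),
    ("Cluster A", 70.0, 1.60),
    ("Cluster A", 50.0, 1.60),
    ("Cluster B", 90.0, 1.70),
    ("Cluster B", 60.0, 1.70),
    ("Cluster B", 58.0, 1.60),
    ("Cluster B", 62.0, 1.75),
]
prevalence = prevalence_by_group(records)
print(prevalence)
```

Comparing each group's value with an overall figure is the same kind of check the abstract describes when it compares provincial prevalence with the national prevalence.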
The accuracy of the data classes is very important in classification with a machine learning approach. The more accurate the data sets and classes, the better the output generated by machine learning. In practice, classification often faces class imbalance, in which the classes do not hold equal shares of the data set. Class imbalance affects classification accuracy, and one of the easiest ways to correct it is to rebalance the classes. This study aims to explore and address class imbalance in a moderately imbalanced dataset. The Synthetic Minority Over-Sampling Technique (SMOTE) is used to overcome class imbalance in obesity status in the Indonesia 2013 Basic Health Research (RISKESDAS). The results show that the obese class accounts for 13.9% of the data and the non-obese class for 84.6%, which indicates a moderate class imbalance. SMOTE with 600% over-sampling increases the size of the minority class (obese), so that the obesity status classes become balanced. SMOTE therefore performed better than no SMOTE in exploring obesity status in the Indonesia RISKESDAS 2013 data.
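Over-sampling at 600% means six synthetic cases are generated for every original minority case, so the minority class grows to seven times its size (roughly 13.9 × 7 ≈ 97 against 84.6 for the majority, in the proportions reported above). A minimal sketch of the SMOTE interpolation step follows, assuming numeric features and Euclidean distance; the function `smote_oversample`, its parameters, and the sample points are hypothetical, and the study's actual implementation and settings are not stated in the abstract.

```python
import math
import random


def smote_oversample(minority, percent=600, k=5, seed=0):
    """Generate synthetic minority samples by SMOTE-style interpolation.

    minority: list of numeric feature tuples from the minority class.
    percent:  over-sampling amount; 600 yields 6 synthetic points per original.
    """
    if percent <= 0 or percent % 100 != 0:
        raise ValueError("percent must be a positive multiple of 100")
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    per_sample = percent // 100
    k = min(k, len(minority) - 1)
    synthetic = []
    for i, point in enumerate(minority):
        # k nearest minority neighbours of this point (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(point, p))[:k]
        for _ in range(per_sample):
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # position along the segment point -> neighbour
            synthetic.append(
                tuple(a + gap * (b - a) for a, b in zip(point, neighbour))
            )
    return synthetic


# Hypothetical minority-class points (two numeric features), illustration only.
minority = [(27.0, 45.0), (29.5, 52.0), (31.0, 38.0), (26.0, 60.0), (33.5, 41.0)]
synthetic = smote_oversample(minority, percent=600, k=3)
balanced_minority = minority + synthetic
print(len(minority), len(synthetic), len(balanced_minority))
```

Each synthetic point lies on the line segment between a minority case and one of its nearest minority neighbours, which is what distinguishes SMOTE from simple duplication. Survey variables that are categorical would need a variant of the technique, since interpolation is only meaningful for numeric features.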
The spatial variation of type 2 diabetes mellitus (T2DM) and hypertension and their potential linkage were explored in South Sulawesi Province, Indonesia. The Global Moran’s I and regression analysis were used to identify the characteristics involved. The analyses were based on T2DM and hypertension data from 2017 and 2018 acquired from the Social Health Insurance Administration in Indonesia. The Global Moran’s I results showed that the prevalence rates of T2DM and hypertension were randomly distributed across space (p = 0.678 and p = 0.711, respectively). Using Generalized Poisson Regression analysis, our study showed a significant relationship between T2DM and hypertension (p ≤ 0.001). This research could help policy makers plan and support projects aimed at reducing the risk of T2DM and hypertension.
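Global Moran's I summarises whether neighbouring areas tend to have similar values (positive I), dissimilar values (negative I), or no spatial pattern (I near its null expectation of -1/(n-1)). The sketch below computes the statistic only, assuming a binary contiguity weight matrix; the four-district layout and the rates are hypothetical, and the significance test that produces p-values like those reported above is not shown.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a spatial weight matrix.

    values:  list of n observations (e.g. prevalence rate per district).
    weights: n x n matrix, weights[i][j] > 0 when areas i and j are neighbours.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of neighbouring areas.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical layout: four districts in a row, each touching the next one.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
gradient_rates = [1.0, 2.0, 3.0, 4.0]     # similar values sit side by side
alternating_rates = [1.0, 4.0, 1.0, 4.0]  # neighbours always differ

i_gradient = morans_i(gradient_rates, chain_weights)
i_alternating = morans_i(alternating_rates, chain_weights)
print(i_gradient, i_alternating)
```

A smooth gradient gives a positive value and an alternating pattern gives a negative one. A result close to -1/(n-1) with a large p-value, as reported in the abstract, is read as no detectable spatial clustering.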