In this paper, we experimented with a set of machine-learning classifiers for predicting the risk of a person being overweight or obese, taking into account his/her dietary habits and socioeconomic information. We investigate with ten different machine-learning algorithms combined with four feature-selection strategies (two evolutionary feature-selection methods, one feature selection from the literature, and no feature selection). We tackle the problem under a binary classification approach with evolutionary feature selection. In particular, we use a genetic algorithm to select the set of variables (features) that optimize the accuracy of the classifiers. As an additional contribution, we designed a variant of the Stud GA, a particular structure of the selection operator of individuals where a reduced set of elitist solutions dominate the process. The genetic algorithm uses a direct binary encoding, allowing a more efficient evaluation of the individuals. We use a dataset with information from more than 1170 people in the Spanish Region of Madrid. Both evolutionary and classical feature-selection methods were successfully applied to Gradient Boosting and Decision Tree algorithms, reaching values up to 79% and increasing the average accuracy by two points, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.