Background
Machine learning is the study of how computers can learn from data automatically. The present study aimed to predict dietary patterns and to compare the performance of machine learning algorithms in making these predictions.
Methods
We analysed data from public employees (n = 12,667) participating in the Brazilian Longitudinal Study of Adult Health (ELSA‐Brasil). The K‐means clustering algorithm was used to identify dietary patterns, and six classifiers (support vector machines, naïve Bayes, K‐nearest neighbours, decision tree, random forest and XGBoost) were used to predict them.
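The two-stage design described above (unsupervised clustering to define the dietary patterns, then supervised classification to predict them) can be illustrated with a minimal sketch. It is not the study's code: the data are synthetic, the food groups, predictor variables and all numeric settings are assumptions chosen for illustration, and only K‐means and a K‐nearest neighbours classifier are shown, written with the Python standard library so the example is self-contained.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = rng.sample(points, k)  # initialise from k data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels

def knn_predict(train_x, train_y, x, k=15):
    """Majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: sq_dist(train_x[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)

# --- Synthetic participants (all values are illustrative assumptions) ---
N = 600
hidden = [i % 2 for i in range(N)]   # latent group, unknown to the models
WESTERN = (4.0, 2.0, 1.0, 1.0)       # mean servings/day: refined cereals,
PRUDENT = (2.0, 0.5, 3.0, 3.0)       # red meat, fruit, vegetables
intakes = [tuple(rng.gauss(m, 0.5) for m in (PRUDENT if g else WESTERN))
           for g in hidden]
# Hypothetical predictors: age (decades) and a physical activity score,
# only loosely related to the latent group.
predictors = [(rng.gauss(4.5 + 1.0 * g, 1.0), rng.gauss(2.0 + 1.0 * g, 1.5))
              for g in hidden]

# Stage 1: define dietary patterns by clustering the intake data.
labels = kmeans(intakes, k=2)

# Stage 2: predict the cluster label from the predictors, then score
# the classifier on participants held out from training.
split = 400
train_x, train_y = predictors[:split], labels[:split]
test_x, test_y = predictors[split:], labels[split:]
correct = sum(knn_predict(train_x, train_y, x) == y
              for x, y in zip(test_x, test_y))
accuracy = correct / len(test_y)

print("cluster sizes:", labels.count(0), labels.count(1))
print(f"held-out accuracy: {accuracy:.2f}")
```

The key design point is that the classifier never sees the intake data: the cluster label from stage 1 is the target, and only the separate predictor variables are used as inputs. In practice, a library such as scikit-learn would typically replace the hand-written functions.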
Results
K‐means clustering identified two dietary patterns. Cluster 1, labelled the Western pattern, was characterised by higher energy intake and higher consumption of refined cereals, beans and other legumes, tubers, pasta, processed and red meats, high‐fat milk and dairy products, and sugary beverages. Cluster 2, labelled the Prudent pattern, was characterised by higher intakes of fruit, vegetables, whole cereals, white meats, and milk and reduced‐fat milk derivatives. The most important predictors of dietary pattern were age, sex, per capita income, education level and physical activity. Model accuracy ranged from moderate to good (69%–72%).
Conclusions
The algorithms performed similarly in predicting dietary patterns, and the models presented may support screening tasks and guide health professionals in the analysis of dietary data.