BACKGROUND
Advances in artificial intelligence (AI) technology have raised new possibilities for the effective evaluation of daily dietary intake, but more empirical study is needed on the use of such technologies in realistic meal scenarios. This study developed an automated food recognition technology and integrated it into a previously developed app to improve the usability of meal reporting. The newly developed app automatically detects and recognizes multiple dishes from a single real-time food image provided as input. App performance was tested with young adults under authentic dining conditions.
OBJECTIVE
A two-group comparative study was conducted to assess app performance using metrics of accuracy, efficiency, and user perception. The experimental group, termed the Automatic Image-based Reporting (AIR) group, was compared against a control group using the previous version of the app, termed the Voice Input Reporting (VIR) group. Each app is designed to support a distinct method of food intake reporting: AIR users capture and upload images of their selected dishes, supplemented with voice commands where appropriate, whereas VIR users supplement the uploaded image with verbal input of food names and attributes.
METHODS
The two mobile apps were subjected to a head-to-head parallel randomized evaluation. A cohort of 42 young adults aged 20-25 years (9 male and 34 female) was recruited from a university in Taiwan and randomly assigned to the AIR (n=22) or VIR (n=20) group. Both groups were assessed using the same menu of 17 dishes. Each meal was designed to represent a typical lunch or dinner setting, consisting of one staple, one main course, and three side dishes. All participants used the app on the same type of smartphone, and both interfaces shared uniform user interactions, icons, and layouts. Analysis of the gathered data focused on reporting accuracy, time efficiency, and user perception.
RESULTS
In the AIR group, 86% of dishes were correctly identified, while 68% of dishes were accurately reported. The AIR group exhibited significantly higher identification accuracy than the VIR group (p<.001) and required significantly less time to complete food reporting (p<.001). System Usability Scale (SUS) scores showed that both apps were perceived as having high usability and learnability (p=.20).
CONCLUSIONS
The AIR group outperformed the VIR group in accuracy and time efficiency for overall dish reporting within the meal testing scenario. While further technological enhancement may be required, the integration of AI vision technology into existing mobile applications holds promise. Our results provide an evidence-based contribution to the integration of automatic image recognition technology into existing apps in terms of user interaction efficacy and overall ease of use. Further empirical work is required, including full-scale randomized controlled trials and assessments of user perception across a range of dining conditions.
CLINICALTRIAL
International Standard Randomized Trial Registry ISRCTN27511195; https://doi.org/10.1186/ISRCTN27511195