Most current food recognition systems rely on deep learning for category classification. However, these approaches struggle to distinguish visually similar food samples, which makes fine-grained recognition a pressing problem in this domain. To address it, we propose a Gaussian and causal-attention model designed for fine-grained object recognition. The model is first trained to capture Gaussian features in target regions and then extracts detailed features from the objects, which improves the feature mapping capability of the target regions. To counter data drift caused by skewed data distributions, we adopt a counterfactual reasoning strategy: counterfactual interventions are used to measure the effect of the learned image attention mechanism on network predictions, which allows the attention weights to be optimized for fine-grained image recognition. A learnable loss strategy is also developed to keep training consistent across the different modules, thereby improving the accuracy of the final recognition task. We validated the method on four datasets. The Gaussian and Causal-Attention Model (GCAM) outperformed existing state-of-the-art methods on the ETH-FOOD101, UECFOOD256, and Vireo-FOOD172 datasets and achieved leading results on the CUB-200 dataset.
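The counterfactual intervention described above can be illustrated with a small sketch. This is not the GCAM implementation, which the abstract does not specify. It assumes one common formulation of counterfactual attention: the prediction made with the learned attention maps is compared with the prediction made after replacing them with random maps, and the difference is treated as the effect of attention. All names here (`attention_pool`, `counterfactual_effect`, `total_loss`), the linear classifier, and the choice of random attention as the intervention are assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention_pool(features, attention):
    """Pool local features with attention maps.

    features:  (P, C) array, P spatial positions with C channels each.
    attention: (M, P) array, M attention maps over the P positions.
    Returns a flat (M * C,) descriptor.
    """
    return (attention @ features).reshape(-1)


def counterfactual_effect(features, attention, classifier_w, rng):
    """Return (factual_logits, effect_logits).

    Assumption: the intervention swaps the learned attention for random
    attention maps; the effect is the resulting change in the logits.
    """
    factual = attention_pool(features, attention) @ classifier_w
    # Hypothetical intervention: random maps, normalised over positions.
    random_attention = softmax(rng.normal(size=attention.shape), axis=-1)
    counterfactual = attention_pool(features, random_attention) @ classifier_w
    return factual, factual - counterfactual


def cross_entropy(logits, label):
    """Cross-entropy of a single example given its integer class label."""
    return -np.log(softmax(logits)[label])


def total_loss(factual, effect, label):
    """Supervise both the factual prediction and the attention effect."""
    return cross_entropy(factual, label) + cross_entropy(effect, label)
```

Supervising the effect term in addition to the factual prediction rewards attention maps that do better than an arbitrary alternative, which is one way to read "optimization of attention weights" in the abstract. How GCAM combines its loss terms through the learnable loss strategy is not stated there, so the plain sum in `total_loss` is only a placeholder.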