Overcoming the Limitations of Learning-Based VQA for Counting Questions with Zero-Shot Learning
A. Lubna, Saidalavi Kalady
Abstract: Visual question answering (VQA) research has garnered increasing attention in recent years. It is considered a visual Turing test because it requires a computer to respond to textual questions based on an image. Expertise in computer vision, natural language processing, knowledge understanding, and reasoning is required to solve the problem of VQA. Most techniques employed for VQA consist of models that are developed to learn the combination of image and question features along with the expected answer. The te…