Artificial Intelligence techniques powered by deep neural nets have achieved much success in several application domains, most significantly and notably in the Computer Vision applications and Natural Language Processing tasks. Surpassing human-level performance propelled the research in the applications where different modalities amongst language, vision, sensory, text play an important role in accurate predictions and identification. Several multimodal fusion methods employing deep learning models are proposed in the literature. Despite their outstanding performance, the complex, opaque and black-box nature of the deep neural nets limits their social acceptance and usability. This has given rise to the quest for model interpretability and explainability, more so in the complex tasks involving multimodal AI methods. This paper extensively reviews the present literature to present a comprehensive survey and commentary on the explainability in multimodal deep neural nets, especially for the vision and language tasks. Several topics on multimodal AI and its applications for generic domains have been covered in this paper, including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain.INDEX TERMS deep multimodal learning, explainable AI, interpretability, survey, trends, vision and language research, XAI.
Web Information Processing (WIP) has enormously impacted modern society since a huge percentage of the population relies on the internet to acquire information. Social Media platforms provide a channel for disseminating information and a breeding ground for spreading misinformation, creating confusion and fear among the population. One of the techniques for the detection of misinformation is machine learning-based models. However, due to the availability of multiple social media platforms, developing and training AI-based models has become a tedious job. Despite multiple efforts to develop machine learning-based methods for identifying misinformation, there has been very limited work on developing an explainable generalized detector capable of robust detection and generating explanations beyond black-box outcomes. Knowing the reasoning behind the outcomes is essential to make the detector trustworthy. Hence employing explainable AI techniques is of utmost importance. In this work, the integration of two machine learning approaches, namely domain adaptation and explainable AI, is proposed to address these two issues of generalized detection and explainability. Firstly the Domain Adversarial Neural Network (DANN) develops a generalized misinformation detector across multiple social media platforms. DANN is employed to generate the classification results for test domains with relevant but unseen data. The DANN-based model, a traditional black-box model, cannot justify and explain its outcome, i.e., the labels for the target domain. Hence a Local Interpretable Model-Agnostic Explanations (LIME) explainable AI model is applied to explain the outcome of the DANN model. To demonstrate these two approaches and their integration for effective explainable generalized detection, COVID-19 misinformation is considered a case study. We experimented with two datasets and compared results with and without DANN implementation. It is observed that using DANN significantly improves the F1 score of classification and increases the accuracy by 5% and AUC by 11%. The results show that the proposed framework performs well in the case of domain shift and can learn domain-invariant features while explaining the target labels with LIME implementation. This can enable trustworthy information processing and extraction to combat misinformation effectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.