Purpose
The purpose of this study is to develop a synthesized conceptual framework for artificial intelligence (AI) adoption in business-to-business (B2B) marketing.
Design/methodology/approach
A conceptual development approach is adopted, based on a content analysis of 59 papers in peer-reviewed academic journals, to identify the drivers, barriers, practices and consequences of AI adoption in B2B marketing. From these analyses and findings, a conceptual model is developed.
Findings
This paper identifies two key drivers of AI adoption: the shortcomings of current marketing activities and the external pressure imposed by informatization. Seven outcomes are identified, namely, efficiency improvements, accuracy improvements, better decision-making, customer relationship improvements, sales increases, cost reductions and risk reductions. Based on information processing theory (IPT) and organizational learning theory (OLT), an integrated conceptual framework is developed to explain the relationships among the constructs of AI adoption in B2B marketing.
Originality/value
This study is the first conceptual paper to synthesize the drivers, barriers and outcomes of AI adoption in B2B marketing. The conceptual model, derived from the combination of IPT and OLT, provides a comprehensive framework for future work and opens new avenues of research on this topic. This paper contributes to both the AI literature and the B2B literature.
Visual Question Answering (VQA) is a challenging multi-modal learning task, since it requires an understanding of both the visual and textual modalities simultaneously. The approaches used to represent images and questions at a fine-grained level therefore play a key role in performance. To obtain fine-grained image and question representations, we develop a co-attention mechanism using an end-to-end deep network architecture to jointly learn both the image and the question features. Specifically, textual attention, implemented by a self-attention model, reduces unrelated information and extracts more discriminative features for the question-level representation, which is in turn used to guide visual attention. We also note that many existing works use complex models to extract feature representations but neglect high-level summary information, such as the question type, in learning. Hence, we introduce the question type into our work by directly concatenating it with the multi-modal joint representation to narrow down the candidate answer space. A new network architecture combining the proposed co-attention mechanism and question type provides a unified model for VQA. Extensive experiments on two public datasets demonstrate the effectiveness of our model compared with several state-of-the-art approaches.

INDEX TERMS: Co-attention, question type, self-attention, visual question answering.

FIGURE 1. Examples of different questions in VQA. Q = question, A = answer.

Most models directly learn a joint embedding of the visual and textual features through linear pooling (such as element-wise addition or multiplication) and then feed it into a classifier to predict the most related answer. Specifically, the visual features are obtained with a convolutional neural network (CNN) pre-trained on object recognition, and the textual features are obtained with a recurrent neural network (RNN). However, these visual and textual features are represented at …