Aspect-Based (also known as aspect-level) Sentiment Classification (ABSC) aims to determine the sentiment polarity of a particular target in a sentence. With the successful application of attention networks in many fields, attention-based ABSC has attracted great interest. However, most previous methods are difficult to parallelize and do not sufficiently obtain and fuse interactive information. In this paper, we propose a Multiple Interactive Attention Network (MIN). First, we use the Bidirectional Encoder Representations from Transformers (BERT) model to pre-process the data. Then, we use a partial Transformer to obtain the hidden states in parallel. Finally, we take the target words and the context words as the core to obtain and fuse interactive information. Experimental results on different datasets showed that our model is more effective than previous methods.

… words [7]. However, these models simply average the aspect or context vector to guide the learning of attention weights on the context or aspect words. Therefore, these models are still at a preliminary stage in dealing with fine-grained sentiment analysis.

In summary, there are two problems with previous approaches. The first is that it is difficult for them to obtain the hidden states interactively and in parallel. The second is that they insufficiently obtain and fuse contextual information and aspect information.

This paper proposes a model named Multiple Interactive Attention Network (MIN) to address these problems. To address the first problem, we take advantage of Multi-Head Attention (MHA) to obtain useful interactive information. To address the second problem, we adopt the target-context pair and Context-Target-Interaction (CTI) in our model.

The main contributions of this paper are as follows:

1. We took advantage of MHA and Location-Point-Wise Feed-Forward Networks (LPFFN) to obtain the hidden states interactively and in parallel. In addition, we applied pre-trained Bidirectional Encoder Representations from Transformers (BERT) [8] in our model (minimal sketches of these components are given after this list).
2. We used the CTI and the target-context pair to help obtain and fuse useful information. We also verified the effectiveness of these two methods.
3. We experimented on different public authoritative datasets: the restaurant and laptop reviews of the SemEval-2014 Task 4 dataset, the ACL (Annual Meeting of the Association for Computational Linguistics) 14 Twitter dataset, the SemEval-2015 Task 12 dataset, and the SemEval-2016 Task 5 dataset. The experimental results showed that our model outperforms state-of-the-art methods.
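To make the BERT pre-processing step concrete, the following is a minimal sketch of encoding a sentence together with its target as a sentence pair. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; neither the library nor the checkpoint is specified in this paper, and the example sentence is hypothetical.

```python
import torch
from transformers import BertTokenizer, BertModel

# Load a pre-trained BERT tokenizer and encoder.
# (The checkpoint choice is an assumption; the paper does not name one.)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

sentence = "The battery life is great but the screen is dim."
target = "battery life"

# Encode the context and the target as a sentence pair:
# [CLS] context [SEP] target [SEP]
inputs = tokenizer(sentence, target, return_tensors="pt")

with torch.no_grad():
    outputs = bert(**inputs)

# outputs.last_hidden_state has shape (1, seq_len, 768): contextualized
# embeddings that the downstream attention layers can consume.
print(outputs.last_hidden_state.shape)
```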
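The interactive use of MHA can be illustrated with a compact scaled dot-product implementation. The sketch below is the generic Transformer formulation written in PyTorch, not our exact parameterization; the tensor shapes and variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Generic scaled dot-product multi-head attention.

    Queries come from one sequence (e.g., the context) and keys/values
    from another (e.g., the target), which makes the attention interactive.
    """
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_k = d_model // n_heads
        self.n_heads = n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value):
        b = query.size(0)
        # Project and split into heads: (batch, n_heads, seq_len, d_k).
        q = self.w_q(query).view(b, -1, self.n_heads, self.d_k).transpose(1, 2)
        k = self.w_k(key).view(b, -1, self.n_heads, self.d_k).transpose(1, 2)
        v = self.w_v(value).view(b, -1, self.n_heads, self.d_k).transpose(1, 2)
        # Scaled dot-product attention, computed for all heads at once.
        scores = q @ k.transpose(-2, -1) / self.d_k ** 0.5
        attn = scores.softmax(dim=-1)
        out = attn @ v
        # Merge the heads back: (batch, seq_len, d_model).
        out = out.transpose(1, 2).contiguous().view(b, -1, self.n_heads * self.d_k)
        return self.w_o(out)

# The context attends to the target (shapes are hypothetical).
mha = MultiHeadAttention(d_model=768, n_heads=8)
context = torch.randn(2, 20, 768)  # batch of 2, 20 context tokens
target = torch.randn(2, 3, 768)    # 3 target tokens
fused = mha(context, target, target)  # (2, 20, 768)
```

Because every head is computed with batched matrix products, attention over the whole sequence is obtained in one pass, which is what makes this component easy to parallelize compared with recurrent encoders.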
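Contribution 1 also relies on LPFFN. The location weighting is specific to our model, but its point-wise feed-forward core is standard; the sketch below shows that core under the usual two-layer Transformer formulation and deliberately omits the location term, which is not reproduced here.

```python
import torch
import torch.nn as nn

class PointWiseFeedForward(nn.Module):
    """Standard position-wise FFN applied independently at every token.

    The paper's LPFFN additionally incorporates location information;
    that term is omitted in this sketch.
    """
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        return self.net(x)  # the same transformation at each position

h = torch.randn(2, 20, 768)           # hidden states from attention
ffn = PointWiseFeedForward(768, 2048)  # d_ff = 2048 is an assumption
print(ffn(h).shape)                    # torch.Size([2, 20, 768])
```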