Emotion recognition is essential for enabling computers to understand human emotions, yet traditional EEG-based emotion recognition methods have significant limitations. To improve the accuracy of EEG emotion recognition, we propose a multiview feature fusion attention convolutional recurrent neural network (multi-aCRNN). Multi-aCRNN combines a CNN, a gated recurrent unit (GRU), and an attention mechanism to deeply fuse features from multiple views. Specifically, a multiscale CNN fuses frequency-domain and spatial-domain information through convolutions at different scales. An attention mechanism then weights the frequency- and spatial-domain information of different time periods to identify the most informative temporal segments. A bidirectional GRU subsequently learns an implicit feature representation along the time axis, achieving deep fusion of features across the time, frequency, and spatial domains. To address label noise, we apply label smoothing, which reduces the influence of noisy labels and improves classification. The model is validated by fivefold cross-validation on EEG data from the 32 subjects of the public DEAP dataset, where multi-aCRNN achieves average classification accuracies of 96.43% and 96.30% on the arousal and valence tasks, respectively. In conclusion, multi-aCRNN integrates EEG features from multiple views more effectively and delivers stronger classification results for emotion recognition.
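To make the pipeline concrete, the following minimal PyTorch sketch illustrates the three stages described above: parallel convolutions at several kernel scales, a temporal attention weighting, and a bidirectional GRU. All layer widths, kernel sizes, and the input layout (frequency-band maps on an electrode grid) are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch of a multiscale-CNN + attention + BiGRU pipeline.
# Sizes and shapes are assumptions for demonstration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiACRNN(nn.Module):
    """Multiscale CNN -> temporal attention -> bidirectional GRU -> classifier."""

    def __init__(self, in_channels=4, hidden=64, num_classes=2):
        super().__init__()
        # Multiscale CNN: parallel convolutions with different kernel sizes
        # jointly capture frequency-domain and spatial-domain patterns.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, 16, kernel_size=k, padding=k // 2)
            for k in (3, 5, 7)
        ])
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        feat_dim = 16 * 3
        # Attention scores weight the fused features of each time window.
        self.attn = nn.Linear(feat_dim, 1)
        # Bidirectional GRU learns the implicit temporal representation.
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):
        # x: (batch, time, bands, grid_h, grid_w)
        b, t = x.shape[:2]
        frames = x.reshape(b * t, *x.shape[2:])
        feats = torch.cat(
            [self.pool(F.relu(branch(frames))).flatten(1) for branch in self.branches],
            dim=1,
        ).reshape(b, t, -1)                    # (batch, time, feat_dim)
        weights = torch.softmax(self.attn(feats), dim=1)
        feats = feats * weights                # emphasize informative periods
        out, _ = self.gru(feats)               # (batch, time, 2 * hidden)
        return self.fc(out[:, -1])             # logits for arousal or valence

# Usage: eight trials, ten time windows, four frequency bands on a 9x9 grid.
model = MultiACRNN()
logits = model(torch.randn(8, 10, 4, 9, 9))   # -> shape (8, 2)
```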
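The label smoothing step can likewise be sketched as follows; the smoothing factor eps = 0.1 is an assumed value, and recent PyTorch versions also expose an equivalent option via nn.CrossEntropyLoss(label_smoothing=...).

```python
# Hedged illustration of label smoothing for the binary arousal/valence targets.
import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits, targets, eps=0.1):
    """Cross-entropy against smoothed targets: the true class receives
    probability 1 - eps and the remaining eps is spread uniformly over the
    other classes, so mislabeled trials pull the model less strongly
    toward a wrong label."""
    n = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, eps / (n - 1))
    smooth.scatter_(-1, targets.unsqueeze(-1), 1.0 - eps)
    return -(smooth * log_probs).sum(dim=-1).mean()

# Usage with the sketch above (hypothetical labels):
loss = smoothed_cross_entropy(torch.randn(8, 2), torch.randint(0, 2, (8,)))
```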