Emotion recognition plays an increasingly valuable role in daily life as artificial intelligence technology matures. However, most existing emotion recognition methods achieve only limited recognition accuracy, which hinders their adoption in practical applications. To alleviate this problem, we propose an expression-EEG interaction multi-modal emotion recognition method based on a bimodal deep autoencoder (BDAE). First, a decision tree is applied for objective feature selection. Then, facial expression features are recognized by sparse representation, and the coefficients of the solution vector are analyzed to determine the expression category of each test sample. Next, the BDAE fuses the EEG and facial expression signals, and the features extracted by its third layer are used for supervised training. Finally, a LIBSVM classifier performs the classification. Experiments on a constructed video library verify the proposed method. The results show that it effectively extracts and integrates high-level emotion-related features from EEG and facial expression signals: the recognition rates of the discrete emotion states and the average recognition rate both improve, with the average reaching 85.71%. Overall, the emotion recognition performance is substantially improved.
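The pipeline above (feature selection, per-modality recognition, bimodal fusion, classification) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses synthetic data, ranks features by class-mean separation as a stand-in for decision-tree importance, compresses the concatenated modalities with a linear projection (the linear-autoencoder optimum, standing in for the BDAE), and classifies with a nearest-class-mean rule instead of LIBSVM. All function names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_features(X, y, k):
    """Keep the k features with the largest absolute class-mean
    difference (a simple stand-in for decision-tree importance)."""
    d = np.abs(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))
    return np.argsort(d)[::-1][:k]

def fuse(eeg, face, dim):
    """Concatenate the two modalities and compress to `dim` features.
    For a linear autoencoder the optimal encoder is spanned by the top
    singular vectors, so SVD serves here as a BDAE stand-in."""
    X = np.hstack([eeg, face])
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:dim].T

def nearest_mean_fit(X, y):
    """Store one prototype (mean vector) per class."""
    return {int(c): X[y == c].mean(axis=0) for c in np.unique(y)}

def nearest_mean_predict(model, X):
    """Assign each sample to the class with the closest prototype."""
    classes = sorted(model)
    D = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[np.argmin(D, axis=0)]

# Synthetic stand-ins for EEG (32-dim) and expression (16-dim) features
# of 200 samples from two emotion classes, shifted by class label.
n = 200
y = rng.integers(0, 2, n)
eeg = rng.normal(0, 1, (n, 32)) + y[:, None] * 0.8
face = rng.normal(0, 1, (n, 16)) + y[:, None] * 0.8

idx = select_features(eeg, y, 16)        # objective feature selection
fused = fuse(eeg[:, idx], face, 8)       # bimodal fusion to 8 dims
model = nearest_mean_fit(fused[:100], y[:100])
acc = (nearest_mean_predict(model, fused[100:]) == y[100:]).mean()
```

Each simplified stage maps one-to-one onto a step of the proposed method; swapping in a real decision tree, sparse-representation solver, BDAE, and LIBSVM recovers the full pipeline.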