In this paper, we propose an Emotional Trigger System that gives the humanoid robot REN-XIN the ability to express emotions automatically. The Emotional Trigger is an emotion classification model trained with our proposed Word Mover’s Distance (WMD)-based algorithm. Because the WMD-based Emotional Trigger System suffers from a long time delay, we also propose an enhanced Emotional Trigger System that enables smooth interaction with the robot. In the enhanced system, the Emotional Trigger is replaced by a deep neural network that combines a conventional convolutional neural network with a long short-term memory network (CNN_LSTM). In our experiments, the CNN_LSTM-based model needs only 10 milliseconds or less to finish a classification, without a decrease in accuracy, while the WMD-based model needs approximately 6-8 seconds to give a result. The experiments are conducted on the same subsets of the Chinese emotional corpus (Ren_CECps) used in the former WMD experiments: one uses 50% of the data for training and 50% for testing (1v1 experiment), and the other uses 80% of the data for training and 20% for testing (4v1 experiment). Four models are compared: WMD, CNN_LSTM, CNN, and LSTM. The results show that CNN_LSTM obtains the best F1 score (0.35) in the 1v1 experiment and nearly the same F1 score as WMD in the 4v1 experiment (0.366 vs. 0.367). Finally, we present demonstration videos of the same scenario to show the performance of robot control driven by the CNN_LSTM-based Emotional Trigger System and by the WMD-based Emotional Trigger System. For comparison, the performance under fully manual control is also recorded.
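As an illustration of the evaluation protocol described above, the following minimal sketch shows how a 1v1 (50%/50%) and a 4v1 (80%/20%) split and an F1 score could be computed. It rests on assumptions that the abstract does not confirm: the split is a seeded random shuffle, each sample carries a single emotion label, and F1 is macro-averaged over labels. The function names `split_corpus` and `macro_f1` are hypothetical and are not taken from the original experiments.

```python
import random


def split_corpus(samples, train_fraction, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test).

    Assumption: a seeded random split; the actual partitioning of
    Ren_CECps is not specified in the abstract.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over all labels seen in truth or prediction.

    Assumption: single-label classification with macro averaging;
    the averaging scheme is not stated in the abstract.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Placeholder sentence identifiers, not real corpus data.
    corpus = [f"sentence_{i}" for i in range(10)]
    train_1v1, test_1v1 = split_corpus(corpus, 0.5)  # 1v1 experiment
    train_4v1, test_4v1 = split_corpus(corpus, 0.8)  # 4v1 experiment
    print(len(train_1v1), len(test_1v1), len(train_4v1), len(test_4v1))
```

Any of the four classifiers (WMD, CNN_LSTM, CNN, LSTM) could in principle be trained on the first part of each split and scored on the second with the same metric function, which keeps the comparison across models consistent.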