“…Interestingly, the top-scoring system in the 2013 FER Challenge was a deep convolutional neural network [34], while the best handcrafted model ranked only fourth [15]. With only a few exceptions [1,32,33], most recent works on facial expression recognition are based on deep learning [2,9,10,13,14,17,21,22,24,23,26,28,38,39,40]. Some of these works [14,17,21,38,39] train an ensemble of convolutional neural networks for improved performance, while others [6,16] combine deep features with handcrafted features such as SIFT [25] or Histograms of Oriented Gradients (HOG) [8].…”