Facial expression recognition (FER) in the wild is an active and challenging field of research. Automatic FER systems are used in a wide range of applications, including advanced human–computer interaction (HCI), human–robot interaction (HRI), human behavioral analysis, and gaming and entertainment. Convolutional neural networks (CNNs) have attained state-of-the-art accuracy in facial analysis tasks. However, recognizing facial expressions in the wild with high confidence on a low-cost embedded device remains challenging. To this end, this study presents an efficient dual-channel ensembled deep CNN (DCE-DCNN) for FER in the wild. Initially, two DCNNs are trained separately, one on grayscale facial images and the other on Scharr-convolved vertical gradient facial images. The two pre-trained DCNNs are then integrated to obtain the dual-channel integrated DCNN (DCI-DCNN). Finally, all three networks (the two single-channel DCNNs and the DCI-DCNN) are jointly fine-tuned into a single dual-channel, multi-output model. The multi-output model produces three prediction scores for a given input facial image. These scores are fused using a max-voting ensemble scheme to give the final DCE-DCNN classification label. On the FER2013, RAF-DB, NCAER-S, AffectNet, and CKPlus benchmark FER datasets, the proposed DCE-DCNN consistently outperforms the two individual DCNNs and numerous state-of-the-art CNNs. Moreover, the network achieves competitive recognition accuracy on all four in-the-wild FER datasets with a smaller memory footprint and fewer parameters. With its high throughput on resource-limited embedded devices, the proposed DCE-DCNN is suitable for applications that require real-time, high-confidence classification of facial expressions in the wild.
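The two pipeline steps named above that are easiest to illustrate in isolation are the Scharr-filtered input channel and the max-voting fusion of the three prediction scores. The following is a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes "vertical gradient" means the Scharr derivative along the row axis, that max voting means a majority vote over each output's top-scoring class, and that ties are broken by the highest individual score. The names `SCHARR_Y`, `scharr_vertical`, and `max_vote` are hypothetical.

```python
from collections import Counter

# Standard 3x3 Scharr kernel for the derivative along the vertical (row) axis.
# Assumption: the abstract's "vertical gradient" refers to this direction.
SCHARR_Y = [[-3, -10, -3],
            [0, 0, 0],
            [3, 10, 3]]


def scharr_vertical(image):
    """Correlate a 2-D grayscale image (a list of rows) with SCHARR_Y.

    Border pixels are replicated, so the output has the input's shape.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for dr in (-1, 0, 1):
                rr = min(max(r + dr, 0), h - 1)  # clamp row index at borders
                for dc in (-1, 0, 1):
                    cc = min(max(c + dc, 0), w - 1)  # clamp column index
                    acc += SCHARR_Y[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


def max_vote(score_vectors):
    """Fuse per-output class scores by majority vote over argmax labels.

    Assumption: if several labels receive the same number of votes, the
    label with the highest single score among the tied labels wins.
    """
    labels = [max(range(len(s)), key=s.__getitem__) for s in score_vectors]
    counts = Counter(labels)
    top = max(counts.values())
    tied = [label for label, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]
    return max(tied, key=lambda label: max(s[label] for s in score_vectors))
```

In this sketch, `max_vote` would receive the three score vectors produced by the multi-output model for one face image and return the index of the fused expression class. With three voters, a tie occurs only when all three outputs disagree, which is the case the assumed tie-break handles.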