This study aims to identify the optimal learning-algorithm parameters, model and connection structure, weight initialization, and normalization method for a fused Convolutional Neural Network (CNN) applied to facial expression recognition. The best model and parameters are selected using ten-fold cross-validation. Determining these elements can potentially yield higher accuracy. The CNN was used to classify facial expressions into seven emotions: happy, sad, angry, surprise, disgust, fear, and neutral. The four-layer CNN was trained on the JAFFE dataset and yielded an overall accuracy of 83.72%. The results indicate that the fused CNN, configured with the selected parameters, can achieve higher accuracy with a smaller network than related models.
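
The ten-fold cross-validation used for model selection can be illustrated with a minimal sketch. This is not the study's implementation: the function names `k_fold_indices` and `cross_validate`, the `evaluate` callback (standing in for training and scoring one CNN configuration), the random seed, and the sample count in the demo are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the folds are reproducible
    # Split as evenly as possible: the first (n_samples % k) folds get one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Return the mean of evaluate(train_idx, val_idx) over the k folds.

    `evaluate` is a hypothetical callback: it would train one candidate
    configuration on train_idx and return its accuracy on val_idx.
    """
    scores = [evaluate(train, val) for train, val in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder evaluator and sample count, for demonstration only.
    def dummy_evaluate(train_idx, val_idx):
        return len(train_idx) / (len(train_idx) + len(val_idx))

    print(cross_validate(200, dummy_evaluate))
```

Under this scheme, each candidate combination of learning parameters, initialization, and normalization would be scored with `cross_validate`, and the combination with the highest mean validation accuracy would be retained.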