This review paper provides a comprehensive analysis of recent advances in Facial Expression Recognition (FER) using deep learning models. Seven state-of-the-art models are examined, each offering unique contributions to the field. The MBCC-CNN model achieves improved recognition rates on diverse datasets, addressing the challenges of facial expression recognition through a multi-branch, cross-connected convolutional neural network architecture. The Deep Graph Fusion model introduces a novel approach for predicting viewer expressions from videos, showing superior performance on the EEV database. Multimodal emotion recognition is explored through a model that fuses EEG signals with facial expressions, achieving high accuracy on the DEAP dataset. The Spark-based LDSP-TOP descriptor, coupled with a 1-D CNN and an LSTM autoencoder, excels at capturing the temporal dynamics needed for facial expression understanding. Vision transformers for micro-expression recognition achieve outstanding accuracy on datasets such as CASME-I, CASME-II, and SAMM. Additionally, a hierarchical deep learning model is proposed for evaluating teaching states based on facial expressions. Lastly, a vision transformer model reaches a recognition accuracy of 100% on the SAMM dataset, demonstrating the potential of combining convolutional and transformer architectures. This review synthesizes key findings, highlights model performance, and outlines directions for future research in FER.