Human Activity Recognition (HAR) is the recognition of human activities, that is, the movements an individual performs with specific body parts. One branch of HAR is the recognition of human emotion. Facial expressions are vital in human communication because they help convey emotional states and intentions, so Facial Expression Recognition (FER) is crucial to understanding how humans communicate. Misinterpreting facial expressions can lead to misunderstanding and difficulty reaching common ground. Deep learning can help in recognizing these expressions. To improve FER performance, we propose a ResNet combined with an Attention module. This approach performs better than the standalone ResNet because the module's localization and sampling grid allow the model to learn spatial transformations of the input image. Consequently, the model becomes more invariant to geometric changes and picks up expression features from the human face more reliably, resulting in better classification. The results show that the proposed method with attention outperforms the same network without it, reaching a test accuracy of 0.7789 on the FER dataset and 0.8327 on the FER+ dataset. We conclude that the Attention module is essential for recognizing facial expressions with a Convolutional Neural Network (CNN). For further research, we suggest, first, evaluating on more datasets besides FER and FER+ and, second, adding a scheduler that decreases the learning rate during training.
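
To make the role of the sampling grid concrete, the following is a minimal sketch and not the implementation used in this study. It assumes the learned transformation is a 2x3 affine matrix, as in spatial transformer networks; the abstract does not state this. The function names `affine_grid` and `bilinear_sample` and the variable `theta` are hypothetical, and `theta` is fixed by hand here, whereas in an attention module it would be predicted from the input image by the localization step.

```python
import math


def affine_grid(theta, height, width):
    """Map every output pixel to source coordinates in [-1, 1].

    theta is a 2x3 affine matrix (assumed form of the transformation).
    """
    grid = []
    for i in range(height):
        # Normalized y coordinate of output row i (a single row maps to 0.0).
        y = 2.0 * i / (height - 1) - 1.0 if height > 1 else 0.0
        row = []
        for j in range(width):
            x = 2.0 * j / (width - 1) - 1.0 if width > 1 else 0.0
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sample a 2D image (list of rows) at the grid's source coordinates."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Locations outside the image contribute zero (zero padding).
        if 0 <= r < height and 0 <= c < width:
            return image[r][c]
        return 0.0

    out = []
    for grid_row in grid:
        out_row = []
        for xs, ys in grid_row:
            # Convert normalized coordinates back to pixel coordinates.
            col = (xs + 1.0) * (width - 1) / 2.0
            row = (ys + 1.0) * (height - 1) / 2.0
            c0, r0 = math.floor(col), math.floor(row)
            dc, dr = col - c0, row - r0
            # Weighted average of the four neighbouring pixels.
            value = (pixel(r0, c0) * (1 - dr) * (1 - dc)
                     + pixel(r0, c0 + 1) * (1 - dr) * dc
                     + pixel(r0 + 1, c0) * dr * (1 - dc)
                     + pixel(r0 + 1, c0 + 1) * dr * dc)
            out_row.append(value)
        out.append(out_row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mirror = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # horizontal flip

same = bilinear_sample(image, affine_grid(identity, 3, 3))
flipped = bilinear_sample(image, affine_grid(mirror, 3, 3))
```

With the identity matrix the image is reproduced unchanged, and with the mirror matrix each row is reversed. Because the sampled values are weighted averages of neighbouring pixels, the output varies smoothly with `theta`, which is what would let a network learn the transformation by gradient descent.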