With the spread of powerful AI models, human-machine interaction has improved rapidly. In this context, recognizing facial expressions of human emotion is essential for successful human-machine collaboration and teaming. While Convolutional Neural Network (CNN) models have been widely used for facial emotion classification, Transformer-based models, originally known for excelling in NLP tasks, have demonstrated superior performance in areas such as image classification, semantic segmentation, and object detection. This study investigates the effectiveness of the Face Transformer, a Transformer-based model originally developed for face identification, fine-tuned for the Facial Emotion Recognition (FER) task. We aim to adapt the Face Transformer architecture to recognize emotional states from facial images using large-scale facial emotion recognition datasets. Our initial findings suggest that the Face Transformer holds promise for bridging the gap between machine interpretability and human emotions, potentially paving the way for more natural and responsive human-computer interactions.