In the context of ongoing digital transformation, effective monitoring of student attendance is of considerable importance for educational institutions. This study presents an approach that uses Vision Transformer (ViT) based iris recognition to automate student attendance tracking. We fine-tuned four Vision Transformer models, ViT-B16, ViT-B32, ViT-L16, and ViT-L32, on the CASIA-Iris-Syn dataset, addressing intra-class variation through data augmentation techniques including rotation, shearing, and brightness adjustment. The results show that ViT-L16 performs best, achieving an accuracy of 95.69%. A comparative analysis with prior methods, in particular those combining a Vision Transformer with a Convolutional Neural Network, confirms the advantage of the proposed ViT-L16 model across accuracy, precision, recall, and F1 score. The experiments were conducted in Jupyter Notebook using Python, TensorFlow, and Keras, with evaluation based on loss, accuracy, and the confusion matrix. ViT-L16 consistently outperforms the other models, demonstrating its robustness for iris recognition in student attendance scenarios. This research is a step towards modernizing attendance systems, offering an accurate, automated solution suited to the evolving needs of educational settings. Future work could explore integrating additional biometric modalities and refining the Vision Transformer architecture for improved performance and broader application in educational environments.
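To make the described setup concrete, the following is a minimal sketch of a fine-tuning pipeline in TensorFlow/Keras with the augmentations named above (rotation, shearing, brightness). It is illustrative only: the `vit-keras` package, the dataset directory `casia_iris_syn/`, the number of classes, and all hyperparameter values are assumptions, not the authors' reported configuration.

```python
# Hypothetical sketch of fine-tuning a ViT-L16 backbone for iris classification.
# Assumes the third-party `vit-keras` package; paths, class count, and
# hyperparameters are placeholders and not taken from the paper.
import tensorflow as tf
from vit_keras import vit  # assumed dependency: pip install vit-keras

NUM_CLASSES = 100   # placeholder: number of enrolled subjects
IMAGE_SIZE = 224    # placeholder input resolution

# Data augmentation to mitigate intra-class variation, as described above.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,             # random rotation (degrees)
    shear_range=0.2,               # random shearing
    brightness_range=[0.8, 1.2],   # random brightness adjustment
    validation_split=0.2,
)
train_gen = datagen.flow_from_directory(
    "casia_iris_syn/",             # placeholder path to CASIA-Iris-Syn images
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=32,
    class_mode="categorical",
    subset="training",
)

# Pretrained ViT-L16 backbone with a new classification head for iris identities.
backbone = vit.vit_l16(
    image_size=IMAGE_SIZE,
    pretrained=True,
    include_top=False,
    pretrained_top=False,
)
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(train_gen, epochs=20)
```

The same skeleton would apply to ViT-B16, ViT-B32, and ViT-L32 by swapping the backbone constructor, which is how the four variants could be compared under identical augmentation and training settings.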