The goal of image anomaly detection is to determine whether an image contains an abnormality. It is currently applied in various fields, including medicine, intelligent information, the military, and manufacturing. The encoder–decoder structure is widely used for anomaly detection: it learns the pattern of normal images and performs well when the anomaly score is computed from the reconstruction error, i.e., the difference between the reconstructed image and the input image. Existing image anomaly detection methods extract normal information from local features of the image, whereas a Vision Transformer-based approach extracts normal information and generates a probability distribution by learning the global relationships between image patches, which also makes it possible to locate anomalies. We propose Vision Transformer and VAE for Anomaly Detection (ViV-Ano), an anomaly detection model that combines a variational autoencoder (VAE) with a Vision Transformer (ViT). The proposed ViV-Ano model showed similar or better performance than existing models on a benchmark dataset. It also showed similar or improved performance on the MVTec anomaly detection dataset (MVTecAD), a dataset for industrial anomaly detection.
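To make the reconstruction-error scoring concrete, the following is a minimal sketch of a ViT-style patch encoder feeding a VAE, with the per-image anomaly score taken as the mean squared reconstruction error. This is an illustrative approximation under assumed settings, not the authors' ViV-Ano implementation; all module names (e.g., ViTVAE, anomaly_score), layer sizes, and the patch size are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ViTVAE(nn.Module):
    """Sketch: Transformer patch encoder -> VAE latent -> patch decoder."""

    def __init__(self, img_size=64, patch_size=8, embed_dim=128,
                 latent_dim=64, depth=4, heads=4):
        super().__init__()
        self.img_size, self.patch_size = img_size, patch_size
        num_patches = (img_size // patch_size) ** 2
        patch_dim = 3 * patch_size * patch_size
        # Patch embedding: split the image into non-overlapping patches and project them.
        self.to_patch = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.pos_emb = nn.Parameter(torch.zeros(1, num_patches, embed_dim))
        # Transformer encoder models the global relationships between patches.
        layer = nn.TransformerEncoderLayer(embed_dim, heads,
                                           dim_feedforward=embed_dim * 4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        # VAE heads: mean and log-variance of the per-patch latent distribution.
        self.to_mu = nn.Linear(embed_dim, latent_dim)
        self.to_logvar = nn.Linear(embed_dim, latent_dim)
        # Decoder reconstructs each patch from its sampled latent vector.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, patch_dim),
        )

    def forward(self, x):
        tokens = self.to_patch(x).flatten(2).transpose(1, 2) + self.pos_emb
        tokens = self.encoder(tokens)
        mu, logvar = self.to_mu(tokens), self.to_logvar(tokens)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        patches = self.decoder(z)
        # Fold the reconstructed patches back into an image.
        recon = F.fold(patches.transpose(1, 2),
                       output_size=(self.img_size, self.img_size),
                       kernel_size=self.patch_size, stride=self.patch_size)
        return recon, mu, logvar


def anomaly_score(model, x):
    """Per-image anomaly score: mean squared reconstruction error."""
    model.eval()
    with torch.no_grad():
        recon, _, _ = model(x)
    return ((x - recon) ** 2).flatten(1).mean(dim=1)


if __name__ == "__main__":
    model = ViTVAE()
    images = torch.rand(2, 3, 64, 64)  # stand-in batch of "normal" images
    recon, mu, logvar = model(images)
    # Standard VAE training objective: reconstruction error plus KL divergence.
    recon_loss = F.mse_loss(recon, images)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon_loss + kld
    print(loss.item(), anomaly_score(model, images))

At test time, only the reconstruction error is used: images that deviate from the learned normal pattern reconstruct poorly and therefore receive higher scores.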