Anomaly detection in video surveillance is crucial but challenging due to the rarity of irregular events and the ambiguity in defining what constitutes an anomaly. We propose AONet, a method that combines a spatiotemporal module for efficient spatiotemporal feature extraction with a residual autoencoder equipped with an attention network for effective future frame prediction in video anomaly detection. AONet introduces a novel activation function, OptAF, which combines the strengths of the ReLU, leaky ReLU, and sigmoid functions. Furthermore, the proposed method employs a combination of robust loss functions to address different aspects of the prediction error and improve training effectiveness. The performance of the proposed method is evaluated on three widely used benchmark datasets. The results indicate that the proposed method outperforms, or performs comparably to, existing state-of-the-art methods, achieving area under the curve values of 97.0%, 86.9%, and 73.8% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech Campus datasets, respectively. In addition, its high inference speed makes it suitable for real-time applications.
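
The abstract does not give the exact formulation of OptAF. The snippet below is only a minimal sketch, assuming one plausible way an activation could blend ReLU, leaky ReLU, and sigmoid behaviour; the class name `OptAFSketch`, the negative slope, and the mixing weight are illustrative assumptions, not the authors' definition.

```python
import torch
import torch.nn as nn

class OptAFSketch(nn.Module):
    """Illustrative activation blending ReLU, leaky ReLU, and sigmoid.

    This is a hypothetical sketch of the idea described in the abstract,
    not the published OptAF formulation.
    """

    def __init__(self, negative_slope: float = 0.01, mix: float = 0.5):
        super().__init__()
        self.negative_slope = negative_slope  # leaky-ReLU slope for x < 0 (assumed value)
        self.mix = mix                        # blend between piecewise-linear and sigmoid parts (assumed value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Piecewise-linear part: behaves like ReLU for x >= 0 and leaky ReLU for x < 0.
        linear_part = torch.where(x >= 0, x, self.negative_slope * x)
        # Smooth, bounded part contributed by the sigmoid.
        smooth_part = torch.sigmoid(x)
        # Convex combination of the two parts (illustrative choice only).
        return self.mix * linear_part + (1.0 - self.mix) * smooth_part
```

Such a blend keeps the unbounded, sparse response of (leaky) ReLU while the sigmoid term adds a smooth, saturating component; how the actual OptAF weights or composes these functions is specified in the paper itself, not in this sketch.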