To address the poor realism and blurriness of short-term typhoon cloud image prediction, we propose a self-attentional spatiotemporal adversarial network (SASTA-Net). First, we introduce a multi-spatiotemporal feature fusion method that fully extracts and fuses multichannel spatiotemporal feature information, effectively enhancing feature expression. Second, we propose an SATA-LSTM prediction model that incorporates a spatial memory cell and an attention mechanism to capture spatial features and important details in sequences. Finally, a spatiotemporal 3D discriminator is designed to distinguish generated predicted cloud images from real cloud images, and adversarial training drives the generator to produce more accurate and realistic typhoon cloud images. Evaluation on a typhoon cloud image data set shows that the proposed SASTA-Net achieves 67.3 in mean squared error, 0.878 in structural similarity, 31.27 in peak signal-to-noise ratio, and 56.48 in sharpness, outperforming state-of-the-art prediction algorithms.
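For reference, the first two reported metrics, mean squared error and peak signal-to-noise ratio, can be computed as in the following sketch. This is a generic pure-Python illustration of the standard definitions, not the authors' evaluation code; it assumes 8-bit grayscale frames (pixel values in [0, 255]) represented as nested lists.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-sized 2D grayscale images."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            count += 1
    return total / count

def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    err = mse(pred, target)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / err)
```

For a whole predicted sequence, these scores are typically averaged over all frames; SSIM and sharpness follow analogous per-frame definitions.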