Visibility prediction affects travel safety and is a major concern for highway and airline authorities. However, current video-based visibility prediction methods usually extract relatively simple features from static images and rarely exploit the continuous variation information contained in video data. This study proposes a visibility prediction method based on the spatio-temporal variation features of a video. The video images are divided spatially, the optimal spatial distribution of different image features is mapped using the correlation matrix, and the spatio-temporal variation information of the video is extracted based on the short-term stationary features of the spatio-temporal evolution of fog. Finally, the relationship between the spatio-temporal variation features of the video and visibility is constructed by combining the time-frequency localization properties of the wavelet transform with the self-learning ability of the neural network. Compared with predictions based on traditional static image features, the R² of the visibility prediction results increased by 0.1698, and the RMSE and MAE decreased by 58.4142 and 20.0427, respectively. In addition, in a comparison with current mainstream prediction methods, the prediction results based on the wavelet neural network reach an R² of 0.9817, an RMSE of 28.3365, and an MAE of 19.2098. These results support the validity and effectiveness of the method, which can be applied to the accurate monitoring of visibility in foggy weather based on video data.
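
The three evaluation metrics reported above can be illustrated with a minimal sketch. It assumes the standard textbook definitions of the coefficient of determination, root mean square error, and mean absolute error, which the abstract itself does not spell out. The function name `regression_metrics` and the sample visibility values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    Standard definitions are assumed; the function name and signature
    are illustrative only and do not come from the study.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual sum of squares and total sum of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Hypothetical observed and predicted visibility values (not from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.4f}, RMSE={rmse:.4f}, MAE={mae:.4f}")
```

A higher R² and lower RMSE and MAE indicate a closer fit, which is why the abstract reports an increase for the first metric and decreases for the other two. RMSE and MAE carry the same unit as the visibility values themselves.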