Urban air pollution poses a significant threat to public health, and mounting environmental pressures make data-driven models for air pollution prediction increasingly necessary. However, contemporary deep learning techniques, such as recurrent neural networks, often struggle to capture the underlying data patterns and distributions, which reduces model stability. To address this gap, this study introduces an ensemble Wasserstein generative adversarial network framework (EWGF) that improves the stability and accuracy of PM2.5 predictions by using Wasserstein generative adversarial networks to learn more informative data representations. The framework contains a feature extraction pipeline that automatically learns features containing residual information as representations of latent features, reducing the underutilization of feature information. We address the nonconvex multi-objective optimization problem that arises when combining diverse Wasserstein generative adversarial network architectures, which mitigates the inherent instability of the predictions. Furthermore, an adaptive search strategy is introduced to determine the optimal distribution of prediction residuals, thereby extending the prediction interval estimation method based on residual distribution. We evaluate the proposed framework on datasets from three major Indian cities. The experiments show that the EWGF outperforms existing solutions in both PM2.5 point prediction and interval prediction, with an approximate 8.07% reduction in mean absolute percentage error and an approximate 19.41% improvement in prediction interval score compared to the baseline model.
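As a rough illustration of two ideas named above, the mean absolute percentage error and a prediction interval built from the distribution of prediction residuals, the following minimal Python sketch may help. It is not the EWGF procedure: it substitutes plain empirical residual quantiles for the adaptive search over residual distributions, and the function names (`mape`, `residual_interval`, `coverage`) and the `alpha` parameter are hypothetical choices made for this example only.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned in percent.

    Assumes no observed value is zero, which would make the ratio undefined.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def residual_interval(point_pred, calib_residuals, alpha=0.1):
    """Build a (1 - alpha) interval around point predictions.

    Assumption: residuals are defined as observed minus predicted and are
    taken from a held-out calibration set. The interval bounds are the
    empirical alpha/2 and 1 - alpha/2 quantiles of those residuals, added
    to each point prediction.
    """
    point_pred = np.asarray(point_pred, dtype=float)
    lo_q, hi_q = np.quantile(calib_residuals, [alpha / 2, 1 - alpha / 2])
    return point_pred + lo_q, point_pred + hi_q


def coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their interval."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

In this sketch, a narrower interval at the same coverage would indicate a sharper interval forecast; replacing the empirical quantiles with quantiles of a fitted parametric residual distribution would bring it closer in spirit to a residual-distribution approach.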