In recent years, air pollution has become a constant threat to human health worldwide, causing many chronic diseases and premature mortality. Poor air quality not only has serious adverse effects on human health and vegetation, but also has major negative political, societal, and economic impacts. Hence, it is essential to invest more effort in accurately forecasting ambient air pollution in order to provide practical and relevant solutions, achieve acceptable air quality, and plan for prevention. In this work, we propose a flexible and efficient deep learning-driven model to forecast concentrations of ambient pollutants. The paper first introduces the traditional Variational AutoEncoder (VAE) and the attention mechanism, and then develops the forecasting strategy based on the innovative Integrated Multiple Directed Attention (IMDA) deep learning architecture. To assess the performance of the proposed forecasting methodology, experimental validation is performed using air pollution data from four US states, and six statistical indicators are used to evaluate forecasting accuracy. The results demonstrate the satisfactory performance of the IMDA-VAE model in forecasting different pollutants at different locations. Furthermore, the results indicate that the proposed IMDA-VAE model effectively improves air pollution forecasting performance and outperforms the other deep learning models considered, namely VAE, Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), bidirectional LSTM, bidirectional GRU, and ConvLSTM. We also show that the proposed model surpasses LSTM and GRU with the attention mechanism.