Nitrogen dioxide (NO$$_{2}$$) is a common air pollutant associated with several adverse health outcomes, such as pediatric asthma, cardiovascular mortality, and respiratory mortality. Because of society’s urgent need to reduce pollutant concentrations, considerable scientific effort has been devoted to understanding pollutant patterns and predicting future pollutant concentrations using machine learning and deep learning techniques. The latter techniques have recently gained much attention due to their capability to tackle complex and challenging problems in computer vision, natural language processing, and other fields. In the NO$$_{2}$$
context, there is still a research gap in adopting these advanced methods to predict pollutant concentrations. This study fills that gap by comparing the performance of several state-of-the-art artificial intelligence models that have not yet been applied in this context. The models were trained using time series cross-validation on a rolling basis and tested across different periods using NO$$_{2}$$
data from 20 ground-based monitoring stations collected by the Environment Agency - Abu Dhabi, United Arab Emirates. Using the seasonal Mann-Kendall trend test and Sen’s slope estimator, we further investigated the pollutant’s trends across the stations. This is the first comprehensive study to report the temporal characteristics of NO$$_{2}$$ across seven environmental assessment points and to compare the performance of state-of-the-art deep learning models for predicting the pollutant’s future concentration. Our results reveal differences in the pollutant’s concentration levels due to the geographic location of the stations, with a statistically significant decrease in the NO$$_{2}$$
annual trend for the majority of the stations. Overall, NO$$_{2}$$
concentrations exhibit similar daily and weekly patterns across the stations, with pollutant levels increasing during the early morning and on the first working day. Among the state-of-the-art models compared, the transformer model performed best (MAE: 0.04 (± 0.04), MSE: 0.06 (± 0.04), RMSE: 0.001 (± 0.01), R$$^{2}$$: 0.98 (± 0.05)), compared with LSTM (MAE: 0.26 (± 0.19), MSE: 0.31 (± 0.21), RMSE: 0.14 (± 0.17), R$$^{2}$$: 0.56 (± 0.33)), InceptionTime (MAE: 0.19 (± 0.18), MSE: 0.22 (± 0.18), RMSE: 0.08 (± 0.13), R$$^{2}$$: 0.38 (± 1.35)), ResNet (MAE: 0.24 (± 0.16), MSE: 0.28 (± 0.16), RMSE: 0.11 (± 0.12), R$$^{2}$$: 0.35 (± 1.19)), XceptionTime (MAE: 0.7 (± 0.55), MSE: 0.79 (± 0.54), RMSE: 0.91 (± 1.06), R$$^{2}$$: $$-$$4.83 (± 9.38)), and MiniRocket (MAE: 0.21 (± 0.07), MSE: 0.26 (± 0.08), RMSE: 0.07 (± 0.04), R$$^{2}$$: 0.65 (± 0.28)). The transformer model is a powerful tool for accurately forecasting NO$$_{2}$$
levels and could strengthen the current monitoring system used to control and manage air quality in the region.
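
The rolling time series cross-validation and the four reported metrics (MAE, MSE, RMSE, R$$^{2}$$) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the study's actual setup: the helper names `rolling_origin_splits` and `regression_metrics` are hypothetical, the training window is assumed to expand (not slide), the fold count and window length are arbitrary, the series is synthetic, and a seasonal-naive baseline stands in for the deep learning models.

```python
import numpy as np


def rolling_origin_splits(n_samples, n_folds, test_size):
    """Yield (train_idx, test_idx) pairs; training data always precedes the test window.

    Assumption: an expanding training window. The source only says "rolling basis".
    """
    first_test_start = n_samples - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        start = first_test_start + k * test_size
        yield np.arange(0, start), np.arange(start, start + test_size)


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for one fold."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MAE": np.mean(np.abs(err)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,  # negative when worse than predicting the mean
    }


# Synthetic hourly series with a daily cycle (illustrative data, not station measurements).
rng = np.random.default_rng(0)
t = np.arange(24 * 28)
series = 0.5 + 0.3 * np.sin(2 * np.pi * t / 24) + 0.05 * rng.standard_normal(t.size)

scores = []
for train_idx, test_idx in rolling_origin_splits(series.size, n_folds=4, test_size=24):
    # Seasonal-naive baseline: repeat the last 24 observed hours as the forecast.
    y_pred = series[train_idx][-24:]
    scores.append(regression_metrics(series[test_idx], y_pred))

# Summarise each metric as mean (± standard deviation) across folds.
for name in ("MAE", "MSE", "RMSE", "R2"):
    values = np.array([s[name] for s in scores])
    print(f"{name}: {values.mean():.3f} (± {values.std():.3f})")
```

Because R$$^{2}$$ is computed as one minus the ratio of residual to total sum of squares, it has no lower bound: a strongly negative value, such as the one reported for XceptionTime, indicates forecasts that fit the test window worse than a constant prediction at the window's mean.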