To address the limitations of traditional primary Air Quality Index (AQI) forecasting models and their incomplete use of meteorological data, this study proposes an air quality prediction method that combines machine learning with enhanced modeling of secondary data. The dataset comprises forecasts of primary pollutant concentrations and primary meteorological conditions, together with measured meteorological observations and pollutant concentrations, covering 23 July 2020 to 13 July 2021 and drawn from long-term air quality forecasts at several monitoring stations in Jinan, China. First, correlation analysis was used to select ten meteorological factors: five categories of measured data and five categories of forecast data. Next, these ten factors were ranked by their influence on the concentration of each pollutant, using univariate and multivariate significance analyses together with a random forest. Seasonal characteristic analysis showed that temperature, humidity, air pressure, and general atmospheric conditions affect the concentrations of six key air pollutants differently from season to season. Among the machine learning classification prediction models evaluated, the Light Gradient Boosting Machine (LightGBM) classifier performed best, with an accuracy of 97.5% and an F1 score of 93.3%. For AQI prediction, the Long Short-Term Memory (LSTM) model performed best, with a goodness-of-fit of 91.37% for AQI, 90.46% for O3, and a perfect fit on the primary pollutant test set. Together, these results support the reliability and effectiveness of the machine learning models used for air quality forecasting.
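For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, F1 score, and goodness-of-fit can be computed. It rests on two assumptions that the abstract does not confirm: that the F1 score is macro-averaged over classes, and that "goodness-of-fit" refers to the coefficient of determination (R²). The function names are hypothetical and are not taken from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed meaning of goodness-of-fit)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative toy values only; these are not data from the study.
    true_labels = ["O3", "O3", "PM2.5", "PM2.5", "PM10", "PM10"]
    pred_labels = ["O3", "O3", "PM2.5", "PM10", "PM10", "PM10"]
    print("accuracy:", accuracy(true_labels, pred_labels))
    print("macro F1:", macro_f1(true_labels, pred_labels))
    print("R^2:", r_squared([50, 60, 70, 80], [52, 58, 71, 79]))
```

Under these definitions, a "perfect fit" corresponds to a value of exactly 1, that is, every test-set prediction matches its target.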