The National Institute of Environmental Research, under the Ministry of Environment of Korea, issues two-day forecasts through AirKorea of the concentration of particulate matter with diameters of ≤ 2.5 μm (PM2.5), expressed as four grades (low, moderate, high, and very high), for 19 districts nationwide. Human forecasters assign the grades subjectively, drawing on forecast results from the Community Multiscale Air Quality (CMAQ) model and artificial intelligence (AI) models in conjunction with weather patterns. This study evaluates forecasts from a long short-term memory (LSTM) algorithm against CMAQ-only forecasts and the AirKorea forecasts, using observations from 2019. The skill of the one-day PM2.5 forecasts over the 19 districts was 39–70% for CMAQ, 72–79% for LSTM, and 73–80% for AirKorea; that is, the AI forecasts showed skill comparable to that of the human forecasters at AirKorea. For the high and very high PM2.5 pollution grades, the one-day forecast skill was 31–98%, 31–74%, and 39–81% for the CMAQ-only, LSTM, and AirKorea forecasts, respectively. Although the CMAQ-only forecasts show good skill for high and very high events, they also generate substantially higher false alarm rates (up to 86%) than the LSTM and AirKorea forecasts (up to 58%). Hence, applying only the LSTM model to the CMAQ forecasts can yield reasonable forecast skill, comparable to that of the operational AirKorea forecasts, which rely on an elaborate combination of the CMAQ model, AI models, and human forecasters. These results suggest that appropriate AI models can greatly enhance PM2.5 forecast skill for Korea in a more objective way.
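The skill and false alarm percentages quoted above are categorical verification scores. The abstract does not define them, so the following is only a minimal sketch under commonly used definitions, which are assumptions here: overall skill as the fraction of days on which the forecast grade matches the observed grade, detection rate for high events as hits / (hits + misses), and false alarm rate as false alarms / (hits + false alarms). The function names and the sample data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple

# The four PM2.5 grades named in the text; "high" and "very high" are
# treated together as high-pollution events.
GRADES = ("low", "moderate", "high", "very high")
HIGH_GRADES = {"high", "very high"}


def grade_accuracy(forecast: Sequence[str], observed: Sequence[str]) -> float:
    """Fraction of days on which the forecast grade equals the observed grade
    (assumed definition of overall forecast skill)."""
    if len(forecast) != len(observed) or len(forecast) == 0:
        raise ValueError("forecast and observed must be non-empty and equal in length")
    hits = sum(f == o for f, o in zip(forecast, observed))
    return hits / len(forecast)


def high_event_scores(
    forecast: Sequence[str], observed: Sequence[str]
) -> Tuple[float, float]:
    """Return (detection rate, false alarm rate) for high-pollution events.

    Assumed definitions:
      detection rate   = hits / (hits + misses)
      false alarm rate = false alarms / (hits + false alarms)
    A score is NaN when its denominator is zero.
    """
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must be equal in length")
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_high = f in HIGH_GRADES
        o_high = o in HIGH_GRADES
        if f_high and o_high:
            hits += 1
        elif o_high:
            misses += 1
        elif f_high:
            false_alarms += 1
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = hits + false_alarms
    far = false_alarms / far if far else float("nan")
    return pod, far


# Made-up eight-day sample for one district, for illustration only.
forecast = ["low", "moderate", "high", "high", "very high", "moderate", "high", "low"]
observed = ["low", "moderate", "moderate", "high", "very high", "high", "low", "low"]

print(f"accuracy: {grade_accuracy(forecast, observed):.3f}")
pod, far = high_event_scores(forecast, observed)
print(f"detection rate: {pod:.3f}, false alarm rate: {far:.3f}")
```

Under these assumed definitions, a forecast that issues high grades liberally can reach a high detection rate and a high false alarm rate at the same time, which is the pattern reported above for the CMAQ-only forecasts.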