In this thesis project, a special type of neural network, the Extreme Learning Machine (ELM), is implemented to predict air quality from the air quality time series itself and from external meteorological records. A regularized version of ELM with linear components is chosen as the main prediction model. To take full advantage of this model, its hyper-parameters are studied and optimized. A set of input variables is then selected (or constructed) to maximize the performance of ELM; two kinds of variable selection methods, wrapper and filter, are evaluated, and the wrapper method, ELM-based forward selection, is chosen. In addition, a feature extraction method, Principal Component Analysis (PCA), is applied to reduce the number of candidate meteorological variables passed to feature selection, which proves to be helpful. Finally, with all parameters properly optimized, ELM is used for the prediction and produces satisfactory results.
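To make the main model concrete, the following is a minimal sketch of a regularized ELM with linear components. It is an illustration under stated assumptions, not the implementation used in this thesis: the tanh activation, the ridge (Tikhonov) penalty, the treatment of "linear components" as the raw inputs passed straight to the output layer, and all names (`elm_features`, `fit_elm`, `predict_elm`, `n_hidden`, `ridge`) are hypothetical choices made for the example.

```python
import numpy as np


def elm_features(X, W, b):
    """Hidden-layer output: random tanh neurons, linear pass-through, bias."""
    nonlinear = np.tanh(X @ W + b)          # random, untrained hidden neurons
    ones = np.ones((X.shape[0], 1))         # bias column
    return np.hstack([nonlinear, X, ones])  # assumption: linear part = raw inputs


def fit_elm(X, y, n_hidden=50, ridge=1e-2, seed=0):
    """Draw random hidden weights once, then solve for output weights only."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = elm_features(X, W, b)
    # Ridge-regularized least squares: (H^T H + ridge * I) beta = H^T y
    A = H.T @ H + ridge * np.eye(H.shape[1])
    beta = np.linalg.solve(A, H.T @ y)
    return W, b, beta


def predict_elm(X, model):
    """Apply the fixed hidden layer and the learned output weights."""
    W, b, beta = model
    return elm_features(X, W, b) @ beta
```

In this sketch the hyper-parameters referred to above would be `n_hidden` and `ridge`. A wrapper method such as forward selection would call `fit_elm` repeatedly on candidate column subsets of `X` and keep the subset with the lowest validation error.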