UBC Theses and Dissertations

Air quality prediction by machine learning methods

Peng, Huiping


As air pollution is a complex mixture of toxic components with considerable impact on humans, forecasting air pollutant concentrations has become a priority for improving quality of life. In this study, air quality data (observational and numerical) were used to produce hourly spot concentration forecasts of ozone (O₃), fine particulate matter up to 2.5 μm in diameter (PM₂.₅) and nitrogen dioxide (NO₂), up to 48 hours ahead, for six stations across Canada: Vancouver, Edmonton, Winnipeg, Toronto, Montreal and Halifax. Using numerical data from an air quality model (GEM-MACH15) as predictors, forecast models for pollutant concentrations were built using multiple linear regression (MLR) and multi-layer perceptron neural networks (MLP NN). A relatively new method, the extreme learning machine (ELM), was also used to overcome the limitations of linear methods as well as the large computational demand of MLP NN. In operational forecasting, the continuous arrival of new data means that the models must be updated frequently. This type of learning, called online sequential learning, is straightforward for MLR and ELM but not for MLP NN. The forecast performance of the online sequential MLR (OSMLR) and the online sequential ELM (OSELM), together with stepwise MLR, all updated daily, was compared with that of MLP NN updated seasonally and with the benchmark, updatable model output statistics (UMOS) from Environment Canada. Overall, OSELM tended to slightly outperform the other models, including UMOS, and was most successful with ozone forecasts and least successful with PM₂.₅ forecasts. MLP NN updated seasonally generally underperformed the linear models MLR and OSMLR, indicating the need to update a nonlinear model frequently.
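
To illustrate why online sequential learning is straightforward for an ELM, the following is a minimal sketch, not the implementation used in the thesis. It assumes a single hidden layer with fixed random weights, a tanh activation, and a small ridge term for numerical stability. The class name `OSELM`, the method `partial_fit` and the parameter `ridge` are hypothetical names chosen for this example. Because only the linear output weights are fitted, each new batch of data can be absorbed with a recursive least-squares update instead of retraining from scratch. With zero hidden nodes replaced by the raw predictors, the same update would give an online sequential MLR.

```python
import numpy as np


class OSELM:
    """Illustrative online sequential ELM regressor (a sketch, not the thesis code)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-layer weights are drawn once at random and never trained.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # P tracks (H^T H + ridge * I)^-1; beta holds the output weights.
        # The ridge term is an assumption made here for numerical stability.
        self.P = np.eye(n_hidden) / ridge
        self.beta = np.zeros((n_hidden, 1))

    def hidden(self, X):
        """Map inputs to hidden-layer outputs with the fixed random weights."""
        return np.tanh(np.asarray(X, dtype=float) @ self.W + self.b)

    def partial_fit(self, X, y):
        """Absorb a new chunk of data with a recursive least-squares update."""
        H = self.hidden(X)
        y = np.asarray(y, dtype=float).reshape(-1, 1)
        # K = (I + H P H^T)^-1 H P, computed with a solve instead of an inverse.
        K = np.linalg.solve(np.eye(len(H)) + H @ self.P @ H.T, H @ self.P)
        self.P = self.P - self.P @ H.T @ K
        self.beta = self.beta + self.P @ H.T @ (y - H @ self.beta)
        return self

    def predict(self, X):
        return (self.hidden(X) @ self.beta).ravel()


# Synthetic data only: 5 "days" of 24 hourly samples with 3 made-up predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

model = OSELM(n_inputs=3, n_hidden=10)
for start in range(0, len(X), 24):  # one update per simulated day
    model.partial_fit(X[start:start + 24], y[start:start + 24])
```

After all chunks have been processed, the output weights match those of a single batch ridge fit on the full data set, which is what makes daily updating cheap. An MLP NN has no equivalent closed-form update because its hidden-layer weights are also trained.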

Attribution-NonCommercial-NoDerivs 2.5 Canada