A machine learning-based air quality forecast system for Pacific Northwest


Fan, K.; Dhammapala, R.; Harrington, K.; Lamastro, R.; Lamb, B.; Lee, Y. H.

Chemical transport models (CTMs) are widely used for air quality forecasts, but they are computationally expensive and often suffer from systematic biases that cause them to miss poor air quality events. In this research, we developed a machine learning (ML) modeling framework that provides O3 and PM2.5 forecasts at monitoring sites throughout the Pacific Northwest (PNW). We used historical archives of Weather Research and Forecasting (WRF) meteorological model forecasts and Air Quality System (AQS) observations of O3 and PM2.5 to build a reliable forecasting system consisting of two ML models, ML1 and ML2. ML1 combines a random forest (RF) classifier with multiple linear regression (MLR) models, and ML2 uses a two-phase RF regression model with best-fit weighting factors. We evaluated the ML forecasting system with 10-fold cross-validation repeated 10 times. Compared to air quality forecasts based on a CTM, ML1 improves forecast skill for mid-to-high O3 events, capturing 77% more unhealthy events, while ML2 improves forecast skill for low-to-mid O3 events (R² = 0.78). For PM2.5, our ML model performs well regardless of season and captures 70% more summer and 30% more winter high-PM2.5 events, whereas the CTM shows systematic biases in both seasons: 31% underprediction during summer and 5.1% overprediction during winter. The ML modeling framework is now used operationally to forecast O3 and PM2.5 at the monitoring sites in the PNW.
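
The repeated cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`repeated_kfold`, `fit_line`, `cross_validated_rmse`), the synthetic data, and the one-variable least-squares model standing in for the RF and MLR models are all assumptions made for illustration. It only shows how 10 repeats of a 10-fold split yield 100 fitted models whose errors are averaged.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        for fold in range(n_splits):
            test_idx = indices[fold::n_splits]  # every n_splits-th shuffled index
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield repeat, fold, train_idx, test_idx


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (hypothetical stand-in model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_rmse(xs, ys, n_splits=10, n_repeats=10, seed=0):
    """Return (mean RMSE over all held-out folds, number of fitted models)."""
    scores = []
    for _, _, train, test in repeated_kfold(len(xs), n_splits, n_repeats, seed):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((ys[i] - (a + b * xs[i])) ** 2 for i in test) / len(test)
        scores.append(mse ** 0.5)
    return sum(scores) / len(scores), len(scores)


# Synthetic data (assumed): a temperature-like predictor and an O3-like target.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 35) for _ in range(200)]
ys = [20 + 1.5 * x + data_rng.gauss(0, 5) for x in xs]

mean_rmse, n_models = cross_validated_rmse(xs, ys)
print(f"{n_models} models fitted, mean held-out RMSE = {mean_rmse:.2f}")
```

Reshuffling before each repeat means every observation is held out exactly once per repeat, under a different partition each time, which reduces the dependence of the skill estimate on any single random split.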

  • Lecture (Conference) (Online presentation)
    ML for Earth System Modelling and Analytics workshop 2021, 03.-04.05.2021, Görlitz, Germany

Permalink: https://www.hzdr.de/publications/Publ-33690
Publ.-Id: 33690