A HYBRID DEEP LEARNING MODEL FOR FORECASTING PM2.5 AIR POLLUTANT CONCENTRATIONS

Massad, Asma

dc.contributor.author	Massad, Asma
dc.date.accessioned	2025-01-23T10:40:51Z
dc.date.available	2025-01-23T10:40:51Z
dc.date.issued	2024-12-18
dc.description.abstract	Air quality forecasting is a crucial research field that helps scientists and policymakers make informed decisions to combat air pollution. Among the various pollutants, PM2.5 (particulate matter with a diameter smaller than 2.5 micrometers) poses significant health risks because it can reach the lower respiratory tract and enter the bloodstream. Accurately forecasting PM2.5 levels is therefore essential. Although machine learning-based spatiotemporal forecasting models have advanced, the pursuit of more accurate forecasts continues. Hybrid deep learning models for PM2.5 forecasting are a promising and active area of research because they aim to capture complex spatiotemporal dependencies more effectively. We developed a Dynamic Graph Attention Network (DyGAT) to model spatial dependencies. DyGAT leverages engineered edge features, including distance, wind speed, and wind direction, and uses attention mechanisms to capture the dynamic nature of these dependencies. DyGAT was then combined with Informer, a Transformer designed for efficient time-series forecasting, to capture both spatial and temporal patterns and improve prediction accuracy. Our model was evaluated on a benchmark dataset from Beijing containing 420,768 records spanning four years. DyGAT-Informer outperformed both a version without the DyGAT component and other baseline models. It achieved an MAE of 50.43, an RMSE of 79.9, and a SMAPE of 28.88%, compared with an MAE of 51.44, an RMSE of 80.83, and a SMAPE of 30.25% for the next best model. Additionally, we conducted a case study using a dataset from Nablus, Palestine, consisting of 2,692 records per station over a two-month period. We incorporated geospatial features describing nearby pollution sources into the dataset. Because the Nablus dataset contained too few records to train the Informer, the Informer was replaced with a sequence-to-sequence Long Short-Term Memory (LSTM) model.
DyGAT-LSTM, trained with additional geospatial features about nearby pollution sources, achieved reductions of 2.08% in MAE, 1.17% in RMSE, and 1.96% in SMAPE, which confirms the benefit of incorporating such data. Finally, despite the short distances between stations, DyGAT successfully captured spatial dependencies: DyGAT-LSTM reduced MAE by 3.13%, RMSE by 1.48%, and SMAPE by 3.67% compared with the LSTM-only model.
dc.identifier.uri	https://hdl.handle.net/20.500.11888/19824
dc.language.iso	en
dc.publisher	An-Najah National University
dc.supervisor	Toma, Anas
dc.supervisor	Khader, Abdelhaleem
dc.title	A HYBRID DEEP LEARNING MODEL FOR FORECASTING PM2.5 AIR POLLUTANT CONCENTRATIONS
dc.title.alternative	نموذج تعلم عميق هجين للتنبؤ بتركيزات ملوث الهواء PM2.5
dc.type	Thesis
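
The abstract reports results as MAE, RMSE, and SMAPE, and as percentage reductions relative to a baseline. The following is a minimal sketch of how such figures are commonly computed, not the thesis's own evaluation code. The function names are hypothetical, and the SMAPE variant shown (mean of 2|y - y_hat| / (|y| + |y_hat|), expressed as a percentage) is an assumption, since several SMAPE definitions are in use and the record does not say which one the thesis adopts.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def smape(actual, predicted):
    """Symmetric MAPE in percent (assumed variant, see lead-in).

    Pairs where both values are zero contribute zero error, which
    avoids a division by zero.
    """
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        if denom > 0:
            total += 2.0 * abs(a - p) / denom
    return 100.0 * total / len(actual)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Toy values for illustration only; these are not thesis data.
    y_true = [10.0, 20.0, 30.0]
    y_pred = [12.0, 18.0, 33.0]
    print("MAE:", mae(y_true, y_pred))
    print("RMSE:", rmse(y_true, y_pred))
    print("SMAPE (%):", smape(y_true, y_pred))
    # Relative MAE gap between the two Beijing figures quoted in the abstract.
    print("MAE reduction (%):", relative_reduction(51.44, 50.43))
```

A helper like `relative_reduction` makes explicit that the percentages quoted for the Nablus case study are relative changes in each metric, not differences in absolute units.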

Files

Original bundle

Name: Thesis_Final.pdf
Size: 7.01 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission

Collections

Artificial Intelligence