2023
DOI: 10.1007/s13762-023-04900-1

Modeling air quality PM2.5 forecasting using deep sparse attention-based transformer networks

Abstract: Air quality forecasting is of great importance for environmental protection, government decision-making, people's daily health, etc. Existing research methods have failed to effectively model the long-term and complex relationships in time-series PM2.5 data and exhibit low precision in long-term prediction. To address this issue, in this paper a new lightweight deep learning model using sparse attention-based Transformer networks (STN) consisting of encoder and decoder layers, in which a multi-head sparse atte…
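The abstract describes multi-head sparse attention inside an encoder and decoder Transformer, but the paper's exact sparsity pattern is not given in the excerpt. The sketch below is therefore only an illustrative, hypothetical example (a single head with top-k masking, a common way to sparsify self-attention) of how sparse attention thins out the dense score matrix when processing long PM2.5 sequences; it is not the authors' implementation.

```python
# Minimal sketch (NOT the paper's STN model): single-head sparse attention
# where each query attends only to its top-k highest-scoring keys.
import torch
import torch.nn.functional as F

def sparse_attention(q, k, v, top_k=8):
    """q, k, v: (batch, seq_len, d_model). Keep only top_k keys per query."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5           # (B, L, L) dense scores
    top_k = min(top_k, scores.size(-1))
    kth = scores.topk(top_k, dim=-1).values[..., -1:]      # k-th largest score per query
    scores = scores.masked_fill(scores < kth, float("-inf"))  # drop all weaker keys
    weights = F.softmax(scores, dim=-1)                     # sparse attention map
    return weights @ v

# Toy usage: a batch of 96-hour PM2.5 windows embedded into 32 features.
x = torch.randn(4, 96, 32)                                  # (batch, hours, d_model)
out = sparse_attention(x, x, x, top_k=8)
print(out.shape)                                            # torch.Size([4, 96, 32])
```

In a full encoder/decoder model this operation would replace dense self-attention in each head, reducing the number of key/value pairs each time step attends to and hence the memory cost on long sequences.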

Cited by 11 publications (1 citation statement); References 74 publications
“…The involvement of an attention mechanism and other novel adjustments addressed the core challenges faced by RNNs. Beyond their success in NLP, attention-based models, or more specifically Transformer-based architectures, have also been widely used in atmospheric pollutant forecasting, owing to their excellent adaptation to parallelization and their avoidance of the vanishing gradient problem while still capturing long-range dependencies [43][44][45][46]. However, the Transformer also has its own disadvantages relative to ordinary RNNs, including sequence-length limitations and large memory requirements.…”
Section: Introduction (mentioning)
confidence: 99%