2021
DOI: 10.1007/978-3-030-85713-4_11
Evaluation of the Transformer Architecture for Univariate Time Series Forecasting

Abstract: The attention-based Transformer architecture is gaining increasing popularity for many machine learning tasks. In this study, we aim to explore the suitability of Transformers for time series forecasting, which is a crucial problem in different domains. We perform an extensive experimental study of the Transformer with different architecture and hyper-parameter configurations over 12 datasets with more than 50,000 time series. The forecasting accuracy and computational efficiency of Transformers are compared w…

Cited by 10 publications (4 citation statements)
References 16 publications
“…A decoder-only Transformer method for univariate time-series forecasting is introduced in (Lara-Benítez et al., 2021). That study adheres to the conventional Transformer decoder design, in which each decoder block comprises a masked self-attention module followed by a multi-head attention module and a feed-forward block.…”
Section: Related Work
confidence: 99%
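
For context, the following is a minimal sketch of one such decoder block, assuming PyTorch; the class name, layer sizes, and hyper-parameters are illustrative, not the authors' code. The sketch keeps only the masked self-attention and feed-forward sub-layers: the second multi-head attention stage mentioned in the quote corresponds to encoder-decoder cross-attention in the original Transformer, which has no encoder input in a decoder-only model.

    import torch
    import torch.nn as nn

    class DecoderBlock(nn.Module):
        # Hypothetical decoder-only block: masked self-attention plus a
        # feed-forward sub-layer, each with residual connection and layer norm.
        def __init__(self, d_model=64, n_heads=4, d_ff=256, dropout=0.1):
            super().__init__()
            self.self_attn = nn.MultiheadAttention(
                d_model, n_heads, dropout=dropout, batch_first=True)
            self.ff = nn.Sequential(
                nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)

        def forward(self, x):
            t = x.size(1)
            # Causal mask: True entries are disallowed, so time step t can
            # only attend to steps <= t (the "masked" in masked self-attention).
            mask = torch.triu(
                torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
            attn_out, _ = self.self_attn(x, x, x, attn_mask=mask)
            x = self.norm1(x + attn_out)
            return self.norm2(x + self.ff(x))

    # Usage: a batch of 8 embedded univariate windows, 24 steps, 64 features.
    y = DecoderBlock()(torch.randn(8, 24, 64))   # output shape: (8, 24, 64)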
“…Attention-based seq2seq models [41] have emerged in forecasting tasks [42], but have not yet been applied to univariate time series prediction. A study [43] comparing Transformer and LSTM solutions to forecasting problems pointed out the Transformer's limitations in terms of computation and parameter handling.…”
Section: Related Work: A. Univariate Time Series Forecasting
confidence: 99%
“…Unlike LSTM, CNN, and GRU, which rely on step-by-step sequential learning, the Transformer's parallel attention mechanism allows it to overcome long-range dependency constraints in time series. Recent studies [14]–[16] have demonstrated that Deep Transformer (DT) models with attention mechanisms outperform LSTM, CNN, and GRU on time series prediction tasks.…”
confidence: 99%
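
To illustrate the parallelism this quote refers to: scaled dot-product attention scores every pair of time steps in a single matrix multiplication, rather than stepping through the sequence one step at a time as a recurrent model does. A minimal sketch, assuming PyTorch; the function and tensor sizes are illustrative.

    import torch

    def scaled_dot_product_attention(q, k, v):
        # All T x T pairwise time-step scores are computed in one matmul,
        # so distant steps interact directly instead of through recurrence.
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
        return torch.softmax(scores, dim=-1) @ v

    T, d = 96, 32                                # illustrative window length and feature dim
    x = torch.randn(T, d)
    out = scaled_dot_product_attention(x, x, x)  # each step attends to all T steps at once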