A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Nie, Yuqi; Nguyen, Nam Hoai; Sinthong, Phanwadee; Kalagnanam, Jayant

doi:10.48550/arxiv.2211.14730

Cited by 39 publications

(76 citation statements)

References 0 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…The superior performance of ResNet is not surprising as the results are in agreement with [16]. The success of Transformer is also expected, as Transformers have achieved state-of-the-art performance in natural language processing [4,20,28], computer vision [7,12,13], and other time series problems [21,31,39]. On the other hand, the performance of RNN-based models (LSTM and GRU) is worse than simple baselines like ED and DTW.…”

Section: Methodsmentioning

confidence: 63%

Toward a Foundation Model for Time Series Data

Yeh,

Dai,

Chen

et al. 2023

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

View full text Add to dashboard Cite

A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised learningbased pre-training techniques, that can be adapted to various downstream tasks. However, current research on time series pre-training has predominantly focused on models trained exclusively on data from a single domain. As a result, these models possess domainspecific knowledge that may not be easily transferable to time series from other domains. In this paper, we aim to develop an effective time series foundation model by leveraging unlabeled samples from multiple domains. To achieve this, we repurposed the publicly available UCR Archive and evaluated four existing self-supervised learning-based pre-training methods, along with a novel method, on the datasets. We tested these methods using four popular neural network architectures for time series to understand how the pre-training methods interact with different network designs. Our experimental results show that pre-training improves downstream classification tasks by enhancing the convergence of the fine-tuning process. Furthermore, we found that the proposed pre-training method, when combined with the Transformer, outperforms the alternatives. The proposed method outperforms or achieves equal performance compared to the second best method in ∼ 93% of downstream tasks. CCS CONCEPTS• Computing methodologies → Neural networks; Temporal reasoning.

show abstract

Section: Methodsmentioning

confidence: 63%

Toward a Foundation Model for Time Series Data

Yeh,

Dai,

Chen

et al. 2023

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

View full text Add to dashboard Cite

show abstract

“…The longterm patterns cannot be solved simply by extending the length of the lookback window because the long-and short-term repetitive patterns of PM temperature change do not have a certain regularity. Furthermore, increasing the lookback window's duration requires using more memory and processing power [29]. This paper suggests a convolutional neural network skip (CNN-skip) layer to capture the long-and shortterm local repetitive patterns in the temperature change of PMs by interval sampling to solve this problem.…”

Section: The Main Workmentioning

confidence: 99%

A new LSTNet-based temperature prediction model for permanent magnet

Guo,

Chen,

Wang

et al. 2024

Meas. Sci. Technol.

View full text Add to dashboard Cite

Permanent magnet synchronous motors (PMSMs) can effectively protect against demagnetization by accurate permanent magnet (PM) temperature prediction; nevertheless, due to the nonlinear properties and intricate internal structure of PMSMs, accurate PM temperature prediction methods still encounter difficulties. This paper proposes a new PM temperature prediction model (LSTNet-Improved) based on long- and short-term time series network (LSTNet) to increase the prediction accuracy of PM temperature. By adding a multi-scale convolutional (MSC) layer to the convolutional neural network (CNN) layer of LSTNet, the short-term detail-dependent information between PMSM variables can be obtained by the model, and adds a convolutional neural network skip (CNN-skip) layer in place of the gate recurrent unite skip (GRU-skip) layer to find the long- and short-term local repetitive patterns between the variables, which yields additional feature information. Furthermore, a nonlinear bias term (MNB) is added to the original multi-head attention (MHA) layer in order to solve the complex nonlinear relationship between multivariate time series. These layers are applied after the CNN-skip layer and gated recurrent unite (GRU) layers, respectively. This allows the model to become more robust and automatically learn to the complex nonlinear relationship between the sequences, focus on the important feature information, and lessen the impact of redundant information on the PMs' temperature prediction task. The experimental results show that the goodness of fit (R2), root mean square error (RMSE) and mean absolute error (MAE) of the proposed model are 99.93%, 14.55% and 9.45%, respectively, and compared with LSTNet, the R2 is improved by 0.09%, the RMSE and MAE are reduced by 7.03% and 6.51%. These results confirm that the model can correctly predict the PM temperature, efficiently extract both long- and short-term patterns among the variables of the PMSM, and effectively focus on the key feature information in the intricate nonlinear series.

show abstract

“…As for the time series analysis, TST (Zerveas et al, 2021) directly adopts the canonical masked modeling paradigm, which is learning to predict the removed time points based on the remaining time points. Afterward, PatchTST (Nie et al, 2022) learns to predict the masked subseries-level patches to capture the local semantic information and reduce memory usage. However, as we stated before, directly masking time series will ruin the essential temporal variations, making the reconstruction too difficult to guide the representation learning.…”

Section: Related Workmentioning

confidence: 99%

SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling

Dong¹,

Wu²,

Zhang³

et al. 2023

Preprint

View full text Add to dashboard Cite

Time series analysis is widely used in extensive areas. Recently, to reduce labeling expenses and benefit various tasks, self-supervised pre-training has attracted immense interest. One mainstream paradigm is masked modeling, which successfully pre-trains deep models by learning to reconstruct the masked content based on the unmasked part. However, since the semantic information of time series is mainly contained in temporal variations, the standard way of randomly masking a portion of time points will ruin vital temporal variations of time series seriously, making the reconstruction task too difficult to guide representation learning. We thus present SimMTM, a Simple pre-training framework for Masked Time-series Modeling. By relating masked modeling to manifold learning, SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold, which eases the reconstruction task by assembling ruined but complementary temporal variations from multiple masked series. SimMTM further learns to uncover the local structure of the manifold helpful for masked modeling. Experimentally, SimMTM achieves state-of-theart fine-tuning performance in two canonical time series analysis tasks: forecasting and classification, covering both in-and cross-domain settings.

show abstract

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Cited by 39 publications

References 0 publications

Toward a Foundation Model for Time Series Data

Toward a Foundation Model for Time Series Data

A new LSTNet-based temperature prediction model for permanent magnet

SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling

Contact Info

Product

Resources

About