Residual Correction in Real-Time Traffic Forecasting

Kim, Daejin; Cho, Youngin; Kim, Dongmin; Park, Cheonbok; Choo, Jaegul

doi:10.1145/3511808.3557432

Cited by 14 publications

(28 citation statements)

References 8 publications


“…The number of stacked units in MTS-Mixers for capturing temporal and channel interaction is all set as 2 for a fair comparison. We adopt reversible instance normalization (Kim et al, 2022) rather than disentanglement to alleviate the distribution shift problem.…”

Section: Discussion (mentioning)

confidence: 99%

MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing

Li, Rao, Pan et al. 2023

Preprint


Multivariate time series forecasting has been widely used in various practical scenarios. Recently, Transformer-based models have shown significant potential in forecasting tasks due to the capture of long-range dependencies. However, recent studies in the vision and NLP fields show that the role of attention modules is not clear, which can be replaced by other token aggregation operations. This paper investigates the contributions and deficiencies of attention mechanisms on the performance of time series forecasting. Specifically, we find that (1) attention is not necessary for capturing temporal dependencies, (2) the entanglement and redundancy in the capture of temporal and channel interaction affect the forecasting performance, and (3) it is important to model the mapping between the input and the prediction sequence. To this end, we propose MTS-Mixers, which use two factorized modules to capture temporal and channel dependencies. Experimental results on several real-world datasets show that MTS-Mixers outperform existing Transformer-based models with higher efficiency.


“…In these experiments, we train models using the same setup for a fixed number of steps. All models are wrapped by RevIN [22]. The activation function used in MLPs is Swish [23,24].…”

Section: Error Power Spectrum Analysis (mentioning)

confidence: 99%

HiFNet: rethinking time series forecasting models from a perspective of error power spectrum

Zhu, Yuan 2023

Preprint


In recent years, simple models for time series forecasting tasks have attracted considerable attention from researchers. Recent works have revealed that a simple linear mapping is even more competitive in forecasting tasks than some well-designed models; meanwhile, MLPs can outperform linear models on datasets with a large number of channels. However, it remains unclear what the key difference is between these two architectures. In this paper, we explore the difference between linear models and MLPs from a novel perspective of error power spectrum. We analyze the inter-model and intra-training comparisons of error power spectrum and note that: 1) the error power at all frequencies is not uniformly distributed and different models have different error power spectral bias; 2) the error power at different frequencies does not necessarily converge at an equal rate. Based on these key observations, we propose a time series forecasting model called HiFNet, which stands for High-Frequency enhanced Network, and a model-agnostic ensemble learning approach called Frequency Ensemble. We conduct several experiments on different datasets and validate the effectiveness of our approaches.


“…forecasting models based on TSFM excel in resisting nonstationarity brought by distribution shifts [12] and concept drifts [13]. Conversely, forecasting models based on TSFT own more complicated architecture and better capability of capturing long-term dependencies of time-series at the expense of being more vulnerable to over-fitting problem caused by non-stationarity [5].…”

Section: B Problems (mentioning)

confidence: 99%