Muformer: A long sequence time-series forecasting model based on modified multi-head attention
2022
DOI: 10.1016/j.knosys.2022.109584
Cited by 25 publications (5 citation statements)
References 22 publications
“…The multi-head attention mechanism (Fig. 3) achieves this goal by mapping the input sequence to query (Q), key (K), and value (V) vectors. It then applies an attention-scoring function to compute the attention weights of each position relative to every other position, effectively capturing dependencies across the input sequence [47,48]. Finally, the outputs of all attention heads are concatenated and passed through a linear layer to obtain the final output.
Figure 3: Structure of the multi-head attention mechanism model.
…”
Section: Methods
confidence: 99%
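The steps this excerpt describes (Q/K/V projections, an attention-scoring function, concatenation of heads, and a final linear layer) map directly to code. Below is a minimal PyTorch sketch of standard multi-head self-attention in the style of Vaswani et al.; it is not the modified attention proposed in Muformer, and the class name, model dimension, and head count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Minimal standard multi-head self-attention (not Muformer's variant)."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Linear maps producing the query, key, and value vectors.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        # Final linear layer applied to the concatenated head outputs.
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, l, d = x.shape
        # Project, then split into heads: (batch, heads, length, d_head).
        q = self.w_q(x).view(b, l, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_k(x).view(b, l, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_v(x).view(b, l, self.n_heads, self.d_head).transpose(1, 2)
        # Attention-scoring function: scaled dot product, softmax over keys.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        weights = scores.softmax(dim=-1)
        # Weighted sum of values, then concatenate heads and project.
        out = (weights @ v).transpose(1, 2).reshape(b, l, d)
        return self.w_o(out)

x = torch.randn(8, 96, 64)            # (batch, sequence length, d_model)
attn = MultiHeadAttention(d_model=64, n_heads=4)
print(attn(x).shape)                  # torch.Size([8, 96, 64])
```

Note that the softmax produces an L-by-L weight matrix per head, which is exactly the quadratic cost in sequence length discussed in the next excerpt.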
“…The Transformer model proposed by Vaswani et al. [20] has achieved tremendous success in natural language processing tasks, and Li et al. [21] applied it to time series forecasting, addressing the issue of memory bottlenecks. However, traditional Transformer models exhibit high time and space complexity when dealing with long sequences, which limits their practical application in time series forecasting tasks [22,23]. Therefore, in this paper, we propose a Patched Time Series Transformer model with independent channels to address this problem.…”
Section: Related Work
confidence: 99%
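The complexity problem the excerpt raises comes from attending over all L time steps; patching shortens the sequence the attention sees. Below is a minimal sketch of non-overlapping patching with channel independence in the PatchTST style; the function name, patch length, and tensor shapes are illustrative assumptions, not the cited paper's exact implementation.

```python
import torch

def patch_channels(x: torch.Tensor, patch_len: int) -> torch.Tensor:
    """Split each channel of a multivariate series into non-overlapping patches.

    x: (batch, length, channels) -> (batch * channels, n_patches, patch_len)
    Each channel becomes its own token sequence (channel independence), and
    attention then runs over n_patches = length // patch_len tokens instead
    of `length`, shrinking the quadratic attention cost accordingly.
    """
    b, l, c = x.shape
    n_patches = l // patch_len
    x = x[:, : n_patches * patch_len, :]        # drop any ragged tail
    x = x.permute(0, 2, 1)                      # (batch, channels, length)
    return x.reshape(b * c, n_patches, patch_len)

x = torch.randn(8, 96, 7)                       # 7-channel series, length 96
tokens = patch_channels(x, patch_len=16)
print(tokens.shape)                             # torch.Size([56, 6, 16])
```

With a patch length of 16, attention operates over 6 tokens rather than 96 positions per channel, which is the source of the efficiency gain.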
“…Time series forecasting plays a vital role in various domains such as finance (Ma et al. 2022), weather forecasting (Liu et al. 2022a), and sensor data analysis (Zhao et al. 2023). Extracting meaningful patterns and understanding the underlying dynamics of time series to forecast future trends are crucial for informed decision-making and effective problem-solving (Zhang, Guo, and Wang 2023). With the advent of deep learning, convolutional neural networks (CNNs) (Fukushima 1980) and Transformers (Vaswani et al. 2017) have shown remarkable progress in capturing temporal dependencies and extracting features from time series.…”
Section: Introduction
confidence: 99%