2021
DOI: 10.1609/aaai.v35i8.16875

Continuous-Time Attention for Sequential Learning

Abstract: The attention mechanism is crucial for sequential learning, where a wide range of applications have been successfully developed. The mechanism is basically trained to spotlight regions of interest in the hidden states of sequence data. Most attention methods compute the attention score by relating a query to a sequence in which a discrete-time state trajectory is represented. Such discrete-time attention cannot directly attend to a continuous-time trajectory represented via neura…
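To make the contrast concrete, below is a minimal sketch of the standard discrete-time attention the abstract refers to: a query is related to hidden states at discrete steps, and softmax weights spotlight the region of interest. This is an illustrative NumPy example under assumed names and dimensions, not the paper's continuous-time method.

```python
import numpy as np

def discrete_time_attention(query, hidden_states):
    """Scaled dot-product attention over a discrete-time sequence.

    query:         (d,)   query vector for the current step
    hidden_states: (T, d) hidden states at T discrete time steps
    Returns attention weights (T,) and the attended context vector (d,).
    """
    d = query.shape[-1]
    scores = hidden_states @ query / np.sqrt(d)   # relate query to each time step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over discrete steps
    context = weights @ hidden_states             # weighted sum of hidden states
    return weights, context

# Toy usage: 5 discrete time steps, 8-dimensional hidden states.
rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))
q = rng.normal(size=(8,))
w, c = discrete_time_attention(q, h)
print(w.round(3), c.shape)
```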

Cited by 12 publications (3 citation statements)
References 23 publications
“…However, along with recurrent units comes the unstable gradient issue, and difficulties in long sequence modeling and parallelizing (Lipton, Berkowitz, and Elkan 2015). Another group of work introduces attention mechanisms into models for irregular time series (Chien and Chen 2021; Horn et al. 2020; Shukla and Marlin 2021a; Tipirneni and Reddy 2022). For example, Raindrop (Zhang et al. 2022) combines attention with graph neural networks to model irregularity.…”
Section: Background and Related Work
confidence: 99%
“…Early works tackled lifelong learning (with no knowledge preservation, hence no CF) for sentiment analysis (Carlson et al. 2010; Silver, Yang, and Li 2013; Ruvolo and Eaton 2013; Chen, Ma, and Liu 2015; Wang et al. 2019; Qin, Hu, and Liu 2020; Wang et al. 2018). Recent works have dealt with CF in many applications: sentiment analysis (Lv et al. 2019; Ke et al. 2021b; Ke, Xu, and Liu 2021), dialogue systems (Shen, Zeng, and Jin 2019; Madotto et al. 2020; Qian, Wei, and Yu 2021; Chien and Chen 2021), language modeling (Sun, Ho, and Lee 2019; Chuang, Su, and Chen 2020) and learning (Li et al. 2019), cross-lingual modeling (Liu et al. 2020), sentence embedding (Liu, Ungar, and Sedoc 2019), machine translation (Khayrallah et al. 2018; Zhan et al. 2021), question answering (Greco et al. 2019), and named entity recognition (Monaikul et al. 2021).…”
Section: Related Work
confidence: 99%
“…For example, neural ODE [7] parameterizes the hidden-state derivative by regarding time as a variable and subsequently solving the initial value problem. Moreover, Latent ODE [8] and its variants [9], [14]–[16] are recommended for handling irregularly-sampled time-series data. In particular, ME-NODE [10] has proven its capability in analyzing AD progression through a probabilistic model incorporating mixed effects.…”
Section: RNN-based AD Progression Modeling
confidence: 99%
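As a loose illustration of the neural-ODE idea in the statement above: the hidden-state derivative is parameterized by a small network with time as a variable, and the hidden state at (possibly irregular) observation times is recovered by solving an initial value problem. The NumPy/SciPy sketch below uses invented weights and observation times; it is an assumption-laden toy, not the implementation of [7]-[10].

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical parameters of a one-hidden-layer network f(h, t; W) that
# defines the hidden-state derivative dh/dt.
rng = np.random.default_rng(1)
d = 4
W1 = rng.normal(scale=0.5, size=(d + 1, 8))   # input: hidden state plus time
W2 = rng.normal(scale=0.5, size=(8, d))

def dhdt(t, h):
    """Hidden-state derivative: a small MLP applied to (h, t)."""
    x = np.concatenate([h, [t]])
    return np.tanh(x @ W1) @ W2

h0 = rng.normal(size=d)                       # initial hidden state at t = 0

# Solve the initial value problem and evaluate the trajectory at
# irregularly spaced observation times.
obs_times = [0.0, 0.3, 0.35, 1.2, 2.0]
sol = solve_ivp(dhdt, t_span=(0.0, 2.0), y0=h0, t_eval=obs_times)
print(sol.y.T.shape)                          # (5, 4): hidden state at each time
```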