2023
DOI: 10.1109/tkde.2021.3126456

A General Survey on Attention Mechanisms in Deep Learning

Abstract: Attention is an important mechanism that can be employed for a variety of deep learning models across many different domains and tasks. This survey provides an overview of the most important attention mechanisms proposed in the literature. The various attention mechanisms are explained by means of a framework consisting of a general attention model, uniform notation, and a comprehensive taxonomy of attention mechanisms. Furthermore, the various measures for evaluating attention models are reviewed, and methods…


Cited by 220 publications (90 citation statements)
References 87 publications
“…Consider a scoring function f defined as f : ℝᵐ × ℝᵐ → ℝ, which determines the weight assigned to each vector and indicates which vectors are important. Therefore, the context vector c_t is a weighted sum of the hidden states H = {h_1, h_2, …, h_{t−1}} and represents the important information for the current time step, as depicted in the following equations [52]:…”
Section: Attention Layers
confidence: 99%
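The scoring function and context vector described in the quoted passage can be sketched as follows. This is a minimal NumPy illustration, not the survey's reference implementation: it assumes dot-product scoring as one common instance of f : ℝᵐ × ℝᵐ → ℝ, with a softmax turning the scores into attention weights; the function and variable names are illustrative.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of scores
    e = np.exp(x - np.max(x))
    return e / e.sum()

def context_vector(hidden_states, query):
    # f(h_i, q): dot-product scoring, one common choice of f : R^m x R^m -> R
    scores = hidden_states @ query            # shape: (t-1,)
    weights = softmax(scores)                 # attention weights, sum to 1
    # c_t = sum_i weights_i * h_i: weighted sum of the hidden states
    c_t = weights @ hidden_states             # shape: (m,)
    return c_t, weights

# hypothetical example: four hidden states h_1..h_{t-1} of dimension m = 8
H = np.random.randn(4, 8)
q = np.random.randn(8)                        # query for the current time step
c_t, weights = context_vector(H, q)
```

Here `c_t` has the same dimension m as each hidden state, and `weights` is a probability distribution over the t−1 hidden states, so states with higher scores contribute more to the context vector.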
“…, h_{t−1}} and represents the important information for the current time step, as depicted in the following equations [52]:…”
Section: Attention Layers
confidence: 99%
“…Meanwhile, the attention mechanism focuses on the important information in an image and is more expandable and robust (Brauwers and Frasincar, 2021), which improves the performance of CNN models. Different attention modules have been employed.…”
Section: Deep Learning for the Chinese Herbal Slices Image Recognition
confidence: 99%
“…The New York officers drove to Poughkeepsie, N.Y., from Albany, then took a train to Grand Central. Although great success has been achieved in previous work on extracting entities and relations, most relation extraction models ignore the fact that subjects contain rich semantic information about objects and relations that is immediately relevant to relation extraction. Moreover, Wei et al. [11] and Sun et al. [12] have proven that fusing the relevant representations into the context representation enhances model performance in relation extraction, and the attention model has proven its effectiveness in representation fusion [13]. Furthermore, Lai et al. [14] have utilized an attention model to extract relational triplets, further proving its effectiveness in relation extraction.…”
Section: SEO
confidence: 99%