2021
DOI: 10.1609/aaai.v35i5.16549

Noninvasive Self-attention for Side Information Fusion in Sequential Recommendation

Abstract: Sequential recommender systems aim to model users' evolving interests from their historical behaviors, and hence make customized, time-relevant recommendations. Compared with traditional models, deep learning approaches such as CNNs and RNNs have achieved remarkable advances in recommendation tasks. Recently, the BERT framework has also emerged as a promising method, benefiting from its self-attention mechanism for processing sequential data. However, one limitation of the original BERT framework is that it only co…
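The mechanism named in the title, and described in the citation statements below ("modified the attention weights distribution by side information to adjust the items' embeddings"), keeps item embeddings noninvasive by letting side information steer only the attention weights. Below is a minimal sketch of that idea in PyTorch, assuming additive fusion of item and side-information embeddings; the class and variable names are illustrative, not the paper's code:

```python
import torch
import torch.nn as nn

class NonInvasiveSelfAttention(nn.Module):
    """Sketch of non-invasive side-information fusion.

    Side information (e.g. position, category) is fused into the queries
    and keys only, so it influences the attention weights, while the
    values remain pure item embeddings and the item representation space
    stays intact. An invasive fusion would also mix side_emb into the
    values, entangling the two spaces irreversibly.
    """

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, item_emb: torch.Tensor, side_emb: torch.Tensor) -> torch.Tensor:
        # Additive fusion (one possible fusor) feeds only the query/key
        # path that decides the attention weights.
        fused = item_emb + side_emb                  # (batch, seq_len, d_model)
        out, _ = self.attn(query=fused, key=fused, value=item_emb)
        return out

# Toy usage: batch of 2 sequences, 5 items each, model dimension 64.
items = torch.randn(2, 5, 64)   # item embeddings
side = torch.randn(2, 5, 64)    # e.g. summed position + category embeddings
layer = NonInvasiveSelfAttention(d_model=64, n_heads=4)
print(layer(items, side).shape)  # torch.Size([2, 5, 64])
```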


Cited by 73 publications (18 citation statements)
References 20 publications
“…Moreover, we show that an appropriately trained BERT4Rec can match or outperform later models (e.g. DuoRec [47], LightSANs [12] & NOVA-BERT [35]) and therefore may still be used as a state-of-the-art sequential recommendation model.…”
Section: Introduction
confidence: 94%
“…While early sequential recommender systems applied Markov Chains [50], more recently neural network-based models have been shown to outperform traditional models [20,39,54,61]. Since the arrival of the Transformer neural architecture [57] and, in particular, the BERT [11] language model, Transformer-based sequential recommendation models, such as SASRec [24], S3Rec [69], LightSANs [12], NOVA-BERT [35] and DuoRec [47], have achieved state-of-the-art performance in next item prediction.…”
Section: Introduction
confidence: 99%
“…The multimodal features are fed into different modality encoders. The modality encoders extract the representations and are general architectures used in other fields, such as ViT [13] for images and …”

References          Fusion mechanism                              Technique
[34]                Coarse-grained Attention                      CL
[40]                Coarse-grained Attention                      None
[6], [21]           Fine-grained Attention                        None
[30], [27], [57]    Combined Attention                            None
[44], [39]          User-item Graph + Fine-grained Attention      None
[56]                User-item Graph                               CL
[59]                Item-item Graph                               CL
[58], [38]          Item-item Graph                               None
[33]                Item-item Graph + Fine-grained Attention      None
[50], [45]          Knowledge Graph                               None
[2], [46]           Knowledge Graph                               CL
[8]                 Knowledge Graph + Fine-grained Attention      None
[43]                Knowledge Graph + Filtration (graph)          None
[63], [55], [31]    Filtration (graph)                            None
[49], [4]           MLP / Concat                                  DRL
[15], [28]          Fine-grained Attention                        DRL
[61], [36], [48]    None                                          DRL
Section: Procedures of MRS
confidence: 99%
“…The multimodal data contains both global and fine-grained features, such as the tone of an audio recording or the pattern on a piece of clothing. Since coarse-grained fusion is often invasive and irreversible [27],…”
Section: 2.2
confidence: 99%
“…Wang et al. [6] put forward an attentive model with a temporal point process to predict the next music item based on the time series of listening records. Rather than putting the side information directly into the model, Liu et al. [35] modified the attention weight distribution with side information to adjust the items' embeddings in sequential recommendation tasks.…”
Section: Attention Mechanism
confidence: 99%