Proceedings of the Web Conference 2021
DOI: 10.1145/3442381.3449878

Bidirectional Distillation for Top-K Recommender System

Abstract: Recommender systems (RS) have started to employ knowledge distillation, which is a model compression technique training a compact model (student) with the knowledge transferred from a cumbersome model (teacher). The state-of-the-art methods rely on unidirectional distillation transferring the knowledge only from the teacher to the student, with an underlying assumption that the teacher is always superior to the student. However, we demonstrate that the student performs better than the teacher on a significant …


Cited by 31 publications (25 citation statements)
References 24 publications
“…(1) KD by the predictions. Motivated by [5] that matches the class distributions, most existing methods [8,10,12,22,26] have focused on matching the predictions (i.e., recommendation results) from the teacher and the student. The teacher's predictions convey additional information about the subtle difference among the items, helping the student generalize better than directly learning from binary labels [12].…”
Section: Related Work
confidence: 99%
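
The statement above describes prediction-based distillation: the student learns not only from binary feedback but also from the teacher's soft item scores, which encode subtle differences among items. The following is a minimal illustrative sketch of such a loss, assuming PyTorch and raw score logits over a shared candidate item set; the temperature and weighting scheme are hypothetical and not the exact formulation of any cited method.

# Sketch of prediction-based KD for a top-K recommender (assumes PyTorch).
# Combines a hard binary-label term with a soft term matching the teacher.
import torch
import torch.nn.functional as F

def kd_prediction_loss(student_logits, teacher_logits, labels,
                       temperature=2.0, alpha=0.5):
    # Hard-label term: learn directly from observed binary feedback.
    hard = F.binary_cross_entropy_with_logits(student_logits, labels)
    # Soft-label term: the teacher's score distribution over items carries
    # extra information beyond the 0/1 labels.
    soft_teacher = torch.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = torch.log_softmax(student_logits / temperature, dim=-1)
    soft = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft

# Example: a batch of 4 users scored over 10 candidate items.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
labels = torch.randint(0, 2, (4, 10)).float()
print(kd_prediction_loss(student, teacher, labels))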
“…Since a user is interested in only a few items, distilling knowledge of a few top-ranked items is effective to discover the user's preferable items [22]. Most recently, [10] utilizes rank-discrepancy information between the predictions from the teacher and the student. Specifically, [10] focuses on distilling the knowledge of the items ranked highly by the teacher but ranked lowly by the student.…”
Section: Related Work
confidence: 99%
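
The rank-discrepancy idea above can be illustrated with a small weighting sketch: items that the teacher places in its top-K but that the student ranks much lower receive a larger distillation weight. The function name and weighting scheme below are hypothetical, assuming PyTorch; this is not the cited method's exact procedure.

# Sketch of rank-discrepancy-aware weighting (assumes PyTorch).
import torch

def rank_discrepancy_weights(student_scores, teacher_scores, top_k=10):
    # argsort of argsort yields each item's rank position (0 = highest score).
    teacher_rank = teacher_scores.argsort(dim=-1, descending=True).argsort(dim=-1)
    student_rank = student_scores.argsort(dim=-1, descending=True).argsort(dim=-1)
    # Restrict attention to the teacher's top-K items for each user.
    in_teacher_topk = (teacher_rank < top_k).float()
    # Weight grows with how much lower the student ranks a teacher-top item.
    discrepancy = (student_rank - teacher_rank).clamp(min=0).float()
    return in_teacher_topk * discrepancy / student_scores.size(-1)

# Example: 2 users, 100 candidate items; the weights could scale a
# pointwise distillation loss on the selected items.
student = torch.randn(2, 100)
teacher = torch.randn(2, 100)
weights = rank_discrepancy_weights(student, teacher, top_k=10)
print(weights.shape, float(weights.max()))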