2023
DOI: 10.1145/3569930
|View full text |Cite
|
Sign up to set email alerts
|

A Critical Study on Data Leakage in Recommender System Offline Evaluation

Abstract: Recommender models are hard to evaluate, particularly under offline setting. In this paper, we provide a comprehensive and critical analysis of the data leakage issue in recommender system offline evaluation. Data leakage is caused by not observing global timeline in evaluating recommenders e.g., train/test data split does not follow global timeline. As a result, a model learns from the user-item interactions that are not expected to be available at prediction time. We first show the te… Show more

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
7
0

Year Published

2023
2023
2025
2025

Publication Types

Select...
7

Relationship

0
7

Authors

Journals

citations
Cited by 24 publications
(7 citation statements)
references
References 46 publications
0
7
0
Order By: Relevance
“…Therefore, the recommender results and explanations of our proposed online prediction method will be more convincing. The impact of data leakage is recognized by existing works [ 14 , 15 ], but we are the first to offer a comprehensive critical study on this issue under the explainable recommendation scenario.…”
Section: Methodsmentioning
confidence: 99%
See 3 more Smart Citations
“…Therefore, the recommender results and explanations of our proposed online prediction method will be more convincing. The impact of data leakage is recognized by existing works [ 14 , 15 ], but we are the first to offer a comprehensive critical study on this issue under the explainable recommendation scenario.…”
Section: Methodsmentioning
confidence: 99%
“…Data ordering refers to arranging the interactions randomly or by a timestamp and splitting focus on the ratio of training and test sets. More recently, data leakage issue caused by not observing global timeline in recommender system has attracted increasing attention [ 15 , 29 ]. The damage of data leakage is disastrous, rendering the recommendations invalid, since future interactions are used to predict current user preference.…”
Section: Related Workmentioning
confidence: 99%
See 2 more Smart Citations
“…An equally problematic issue was recently reported in the analysis by Sun et al (2020) who found that 37% of their examined works tuned the hyperparameters of their new method on the test set. If such a procedure was admissible, it is trivial to come up with an algorithm for implicit feedback that uses a parameter for each user‐item combination and then “learns” the best setting for this variable by looking at the test set.…”
Section: The Problemsmentioning
confidence: 95%