2022
DOI: 10.48550/arxiv.2202.10842
Preprint

KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems

Chongming Gao,
Shijun Li,
Wenqiang Lei
et al.

Abstract: Recommender systems are usually developed and evaluated on the historical user-item logs. However, most offline recommendation datasets are highly sparse and contain various biases, which hampers the evaluation of recommendation policies. Existing efforts aim to improve the data quality by collecting users' preferences on randomly selected items (e.g., Yahoo! [36] and Coat [47]). However, they still suffer from the high variance issue caused by the sparsely observed data. To fundamentally solve the problem, w…

Cited by 3 publications (6 citation statements)
References 46 publications
“…KuaiEnv is created by us on the KuaiRec dataset [12]. KuaiRec is a real-world dataset that contains a fully observed user-item interaction matrix, which means each user has viewed each video and then left feedback.…”
Section: Recommendation Environments (mentioning)
confidence: 99%
“…For pre-learning the user model φ_M, we use the additional sparse user-video interactions in the big matrix. For the details of the data, please refer to the KuaiRec dataset [12].…”
Section: Recommendation Environments (mentioning)
confidence: 99%
“…We have conducted our experiments on three publicly-available real-world datasets for recommendation: MovieLens-1m, Amazon-14core, and KuaiRec-binary [8]. Among them, MovieLens is a well-known classic which has been extensively studied, while KuaiRec is a fairly new one that just emerged earlier this year.…”
Section: Data and Code (mentioning)
confidence: 99%
“…The fundamental problem lies in the fact that we have no knowledge about the massive missing interactions in the offline data [6].…”
Section: Open Problem: Collecting Random Data (mentioning)
confidence: 99%