2015
DOI: 10.1145/2827872

The MovieLens Datasets

Abstract: The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a dis…

Cited by 2,450 publications (461 citation statements)
References 34 publications
“…For low diversity, the algorithm simply selected the N items closest to the centroid. To test our algorithm we ran several simulations and tests using the 10 M movielens (Harper and Konstan 2015) dataset. As an initial starting set, the Top-200, or the 200 items with highest predicted rating was found to provide a good balance between maximum range in predicted rating and maximum diversity.…”
Section: Diversification Algorithm
Citation type: mentioning
confidence: 99%
“…In our experiments, we have used the publicly available MovieLens 1M dataset [25]. This dataset consists of 1,000,209 ratings by 6040 MovieLens users of approximately 3900 movies, which contains all genuine users.…”
Section: Dataset
Citation type: mentioning
confidence: 99%
“…In this paper, we propose a technique to detect profile injection attacks using Random Forest Classifier and conduct experiments on the MovieLens dataset [25] -1M to verify the effectiveness of the proposed approach by comparing it with different classifier techniques. The paper is organized as follows: Section 2 describes the related work, in Section 3 we discuss details about our proposed approach, Section 4 deals with the experiments performed and their analysis and finally in Section 5 we conclude the paper along with the possible future scope.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Four of these datasets are obtained from Amazon [8], [9], three from MovieLens [10], [11], and one from Netflix [12]. These eight datasets used in our experiments (a) contain reliable timestamps (most of the ratings within each dataset have been entered in real rating time and not in a batch mode), (b) are up to date (published between 1998 and 2016), (c) are widely used as benchmarking datasets in CF research and (d) vary with respect to type of dataset (movies, music, videogames and books) and size (from 2MB, up to 4.7GB).…”
Section: Performance Evaluation
Citation type: mentioning
confidence: 99%
“…The proposed algorithm, as well as the two algorithms presented in [7], are based on the exploitation of timestamp information which is associated with ratings; hence in this work, we use the Amazon datasets [8], [9], the MovieLens datasets [10], [11] and the Netflix dataset [12], which include the ratings' timestamps. It is worth noting that the proposed algorithm can be combined with other techniques that have been proposed for either improving prediction accuracy in CF-based systems, including consideration of social network data (e.g.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
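
The Diversification Algorithm statement quoted above describes a low-diversity step: start from the Top-200 items by predicted rating, then select the N items closest to the centroid. The following is a minimal sketch of that idea, not the citing authors' implementation. The quoted text does not say which space the centroid lives in, so the per-item feature vectors, the Euclidean distance, and the function names `top_k_by_predicted_rating` and `closest_to_centroid` are all assumptions made for illustration.

```python
import math


def top_k_by_predicted_rating(predictions, k=200):
    """Return the k item ids with the highest predicted rating.

    `predictions` maps item id -> predicted rating (hypothetical input).
    """
    return sorted(predictions, key=predictions.get, reverse=True)[:k]


def closest_to_centroid(items, n):
    """Return the n item ids whose vectors lie nearest the centroid.

    `items` maps item id -> feature vector (equal-length lists of floats).
    The feature space and Euclidean metric are assumptions; the quoted
    statement does not specify either.
    """
    vectors = list(items.values())
    dim = len(vectors[0])
    # Component-wise mean of all candidate vectors.
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    ranked = sorted(items, key=lambda item_id: math.dist(items[item_id], centroid))
    return ranked[:n]


# Toy data for illustration only (not drawn from MovieLens).
predictions = {"a": 4.8, "b": 4.5, "c": 4.1, "d": 3.0}
features = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 2.0], "d": [10.0, 10.0]}

candidates = top_k_by_predicted_rating(predictions, k=3)
low_diversity = closest_to_centroid({i: features[i] for i in candidates}, n=2)
```

With real data, `k` would be 200 as in the quoted statement, and the vectors could come from whatever item representation the recommender already uses.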