Datasets for recommender systems are few and often inadequate for the contextualized nature of news recommendation. News recommender systems are time- and location-dependent, make use of implicit signals, and often include both collaborative and content-based components. In this paper we introduce the Adressa compact news dataset, which supports all these aspects of news recommendation. The dataset comes in two versions: the large 20M dataset, covering 10 weeks of traffic on Adresseavisen's news portal, and the small 2M dataset, covering only one week of traffic. We explain the structure of the dataset and discuss how it can be used in advanced news recommender systems.
Entity matching is the problem of identifying which records refer to the same real-world entity. It has been actively researched for decades, and a variety of approaches have been developed. Even today, it remains a challenging problem with considerable room for improvement. In recent years, new methods based on deep learning techniques for natural language processing have emerged.
In this survey, we present how neural networks have been used for entity matching. Specifically, we identify which steps of the entity matching process existing work has targeted using neural networks, and provide an overview of the different techniques used at each step. We also discuss the contributions of deep learning to entity matching compared to traditional methods, and propose a taxonomy of deep neural networks for entity matching.