This work formulates a novel song recommender system 1 as a matrix completion problem that benefits from collaborative filtering through Non-negative Matrix Factorization (NMF) and content-based filtering via total variation (TV) on graphs. The graphs encode both playlist proximity information and song similarity, using a rich combination of audio, meta-data and social features. As we demonstrate, our hybrid recommendation system is very versatile and incorporates several well-known methods while outperforming them. Particularly, we show on real-world data that our model overcomes w.r.t. two evaluation metrics the recommendation of models solely based on low-rank information, graph-based information or a combination of both.
In this work, we propose a new, fast and scalable method for anomaly detection in large time-evolving graphs. It may be a static graph with dynamic node attributes (e.g. time-series), or a graph evolving in time, such as a temporal network. We define an anomaly as a localized increase in temporal activity in a cluster of nodes. The algorithm is unsupervised. It is able to detect and track anomalous activity in a dynamic network despite the noise from multiple interfering sources.We use the Hopfield network model of memory to combine the graph and time information. We show that anomalies can be spotted with a good precision using a memory network. The presented approach is scalable and we provide a distributed implementation of the algorithm.To demonstrate its efficiency, we apply it to two datasets: Enron Email dataset and Wikipedia page views. We show that the anomalous spikes are triggered by the real-world events that impact the network dynamics. Besides, the structure of the clusters and the analysis of the time evolution associated with the detected events reveals interesting facts on how humans interact, exchange and search for information, opening the door to new quantitative studies on collective and social behavior on large and dynamic datasets.
Graphs are now ubiquitous in almost every field of research. Recently, new research areas devoted to the analysis of graphs and data associated to their vertices have emerged. Focusing on dynamical processes, we propose a fast, robust and scalable framework for retrieving and analyzing recurring patterns of activity on graphs. Our method relies on a novel type of multilayer graph that encodes the spreading or propagation of events between successive time steps. We demonstrate the versatility of our method by applying it on three different real-world examples. Firstly, we study how rumor spreads on a social network. Secondly, we reveal congestion patterns of pedestrians in a train station. Finally, we show how patterns of audio playlists can be used in a recommender system. In each example, relevant information previously hidden in the data is extracted in a very efficient manner, emphasizing the scalability of our method. With a parallel implementation scaling linearly with the size of the dataset, our framework easily handles millions of nodes on a single commodity server.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.