Industrial applications of recommendation systems aim to recommend the top-N products that are most appealing to their customers, often focusing on products that customers are likely to purchase in the near future. In this experiments and analyses paper, we present an extensive experimental evaluation of various top-N collaborative filtering recommendation algorithms based on a real-world dataset of customers' purchase histories provided by our business partners at TOTAL. Our study aims to compare representative collaborative filtering approaches in practice and to identify those yielding the highest recommendation accuracy with respect to well-established evaluation measures. These experiments are part of the development of a promotional offers campaign for TOTAL customers owning a loyalty card. We show how different settings for training and applying the selected algorithms influence their absolute and relative performance. The results are valuable to our TOTAL partners as they constitute the first large-scale analysis of recommendation algorithms in the context of their datasets. In particular, the study of the impact of recency in the training set and of the role of customer activity and context in recommendation sheds light on a finer design of promotional product campaigns.
Our focus in this experimental analysis paper is to investigate existing measures for ranking association rules and to understand how they can be augmented to enable real-world decision support and to provide customers with personalized recommendations. For example, by analyzing receipts of TOTAL customers, one can find that customers who buy windshield wash also buy engine oil and energy drinks, or that middle-aged customers from the South of France subscribe to a car wash program. Such actionable insights can immediately guide business decision making, e.g., for product promotion, product recommendation, or targeted advertising. We present an analysis of 30 million unique sales receipts, spanning 35 million records, by almost 1 million customers, generated at 3,463 gas stations over three years. Our finding is that the 35 commonly used measures to rank association rules, such as Confidence and Piatetsky-Shapiro, can be summarized into 5 synthesized clusters based on similarity in their rankings. We then use one representative measure from each cluster to run a user study with a data scientist and a product manager at TOTAL. Our analysis draws actionable insights to enable decision support for TOTAL decision makers: rules that favor Confidence are best for determining which products to recommend, and rules that favor Recall are well suited to finding customer segments to target. Finally, we present how association rules using the representative measures can be used to provide customers with personalized product recommendations.
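To make the measures named above concrete, the following minimal sketch computes Confidence, Recall, and Piatetsky-Shapiro for a single rule A -> B using their common textbook definitions. The function name `rule_measures` and the toy receipts are hypothetical illustrations, not data or code from the paper, and the paper's own definitions may differ in detail.

```python
from typing import Dict, Iterable, List, Set


def rule_measures(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Score the rule antecedent -> consequent over a list of receipts.

    Assumed (textbook) definitions:
      support           = P(A and B)
      confidence        = P(A and B) / P(A)
      recall            = P(A and B) / P(B)
      piatetsky_shapiro = P(A and B) - P(A) * P(B)
    """
    if not transactions:
        raise ValueError("at least one transaction is required")

    a, b = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)

    # Count receipts containing A, B, and both item sets.
    n_a = sum(1 for t in transactions if a <= t)
    n_b = sum(1 for t in transactions if b <= t)
    n_ab = sum(1 for t in transactions if (a | b) <= t)

    p_a, p_b, p_ab = n_a / n, n_b / n, n_ab / n
    return {
        "support": p_ab,
        # Guard against rules whose antecedent or consequent never occurs.
        "confidence": n_ab / n_a if n_a else 0.0,
        "recall": n_ab / n_b if n_b else 0.0,
        "piatetsky_shapiro": p_ab - p_a * p_b,
    }


# Toy receipts (illustrative only, not taken from the TOTAL dataset).
receipts = [
    {"windshield wash", "engine oil"},
    {"windshield wash", "engine oil", "energy drink"},
    {"windshield wash"},
    {"engine oil"},
    {"energy drink"},
]

measures = rule_measures(receipts, {"windshield wash"}, {"engine oil"})
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Ranking a set of candidate rules by each of these scores in turn yields one ordering per measure; comparing such orderings is the kind of similarity on which the clustering of measures described above is based.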