Summary
Learning to rank (L2R) works by constructing a ranking model from training data so that, given a new query, the model is able to generate an effective rank of the objects for the query. Almost all work in L2R focus on ranking accuracy leaving performance and scalability overlooked. However, performance is a critical factor, especially when dealing with on‐demand queries. In this scenario, Learning to Rank using association rules has been shown to be extremely effective but only at a high computational cost. In this work, we show how to exploit parallelism on rule‐based systems to: i) drastically reduce L2R training datasets using selective sampling and ii) to generate query customized ranking models on the fly. We present parallel algorithms and GPU implementations for these two tasks showing that dataset reduction takes only a few seconds with speedups up to 148x over a serial baseline, and that queries can be processed in only a few milliseconds with speedups of 1000x over a serial baseline and 29x over a parallel baseline for the best case. We also extend the implementations to work with multiple GPUs, further increasing the speedup over the baselines and showing the scalability of our proposed algorithms.