In large-scale, active software projects, one of the main challenges of code review is prioritizing the many Code Review Requests (CRRs) that these projects receive. Prior studies have developed many Learning-to-Rank (LtR) models to support CRR prioritization and have adopted rich evaluation metrics to compare their performance. However, those evaluations were performed before observing the complex interactions between CRRs and reviewers, and among review activities, that occur in real-world code reviews. Such a pre-review evaluation gives little indication of how effectively LtR models contribute to code reviews. This study aims to perform a post-review evaluation of LtR models for prioritizing CRRs. To establish the evaluation environment, we employ Software Process Simulation Modeling (SPSM) based on the Discrete-Event Simulation (DES) paradigm to simulate real-world code review processes, together with three customized evaluation metrics. We develop seven LtR models and use the historical review orders of CRRs as baselines for evaluation. The results indicate that employing LtR can effectively help accelerate both the completion of CRR reviews and the delivery of qualified code changes. Among the seven LtR models, LambdaMART is particularly beneficial for accelerating completion, and AdaRank for accelerating delivery. This study empirically demonstrates the effectiveness of DES-based SPSM for simulating code review processes, the benefits of using LtR for prioritizing CRRs, and the specific advantages of several LtR models. It also offers software organizations a new approach to evaluating LtR models and other artificial intelligence-powered software techniques. Data and materials: https://figshare.com/s/a033e99cd2a61e64c8bc.
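To make the idea of a post-review evaluation concrete, the following is a minimal sketch of a discrete-event simulation of a review queue. It is not the simulation model, the metrics, or the data used in this study: the `CRR` fields, the `simulate` and `mean_completion` helpers, the single-reviewer setup, and the toy scores are all assumptions chosen for illustration. It only shows how the same set of requests can be replayed under two review orders (arrival order as a stand-in for the historical baseline, and a ranker's score order) and compared on completion time.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class CRR:
    crr_id: str     # unique identifier (hypothetical)
    arrival: float  # time the request enters the review queue
    effort: float   # review time needed once a reviewer picks it up
    score: float    # hypothetical ranker output; higher means review sooner


def simulate(crrs, key, reviewers=1):
    """Replay the queue; return {crr_id: completion_time}.

    `key` maps a CRR to a sortable value; the waiting CRR with the
    smallest key is reviewed next whenever a reviewer becomes free.
    """
    pending = sorted(crrs, key=lambda c: c.arrival)
    free_at = [0.0] * reviewers  # min-heap of times at which reviewers are free
    heapq.heapify(free_at)
    queue, done, i = [], {}, 0
    while i < len(pending) or queue:
        now = heapq.heappop(free_at)
        # Nothing is waiting: advance the clock to the next arrival event.
        if not queue and pending[i].arrival > now:
            now = pending[i].arrival
        # Admit every CRR that has arrived by the current time.
        while i < len(pending) and pending[i].arrival <= now:
            c = pending[i]
            heapq.heappush(queue, (key(c), c.crr_id, c))
            i += 1
        _, _, c = heapq.heappop(queue)
        finish = now + c.effort
        done[c.crr_id] = finish
        heapq.heappush(free_at, finish)
    return done


def mean_completion(done):
    """Average completion timestamp over all reviewed CRRs."""
    return sum(done.values()) / len(done)


# Toy data, invented for illustration only.
crrs = [
    CRR("a", arrival=0, effort=5, score=0.1),
    CRR("b", arrival=0, effort=1, score=0.9),
    CRR("c", arrival=1, effort=2, score=0.5),
]

baseline = simulate(crrs, key=lambda c: c.arrival)  # arrival (FIFO) order
ranked = simulate(crrs, key=lambda c: -c.score)     # highest score first

print(mean_completion(baseline), mean_completion(ranked))
```

In this toy input the scores happen to favor short reviews, so the ranked order lowers the mean completion time; that is a property of the invented data, not a finding. A realistic model would also need reviewer assignment, rework loops, and review outcomes, which this sketch leaves out.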