Code review is an essential element of any mature software development project; it aims to evaluate code contributions submitted by developers. In principle, code review should improve the quality of code changes (patches) before they are committed to the project's master repository. In practice, bugs are sometimes unwittingly introduced during this process. In this paper, we report on an empirical study investigating code review quality for Mozilla, a large open-source project. We explore the relationships between the reviewers' code inspections and a set of factors, both personal and social in nature, that might affect the quality of such inspections. We applied the SZZ algorithm to detect bug-inducing changes, which were then linked to the code review information extracted from the issue tracking system. We found that 54% of the reviewed changes introduced bugs into the code. Our findings also showed that both personal metrics, such as reviewer workload and experience, and participation metrics, such as the number of involved developers, are associated with the quality of the code review process.
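The SZZ step mentioned above can be approximated with a short script: given a bug-fixing commit, blame the lines that the fix removed to find the commits that last introduced them. The sketch below is a minimal illustration under stated assumptions, not the authors' actual tooling; the repository path, commit identifiers, and the plain `git diff`/`git blame` invocations are placeholders chosen for the example.

```python
# Minimal SZZ-style sketch: given a bug-fixing commit, find candidate
# bug-inducing commits by blaming the lines the fix removed.
# Hypothetical example; repository path and commit hashes are placeholders.
import subprocess

def deleted_lines(repo, fix_commit):
    """Return (file, line_number) pairs removed by the fixing commit."""
    diff = subprocess.run(
        ["git", "-C", repo, "diff", "-U0", f"{fix_commit}^", fix_commit],
        capture_output=True, text=True, check=True).stdout
    current_file, results = None, []
    for line in diff.splitlines():
        if line.startswith("--- a/"):
            current_file = line[6:]
        elif line.startswith("@@"):
            # Hunk header looks like: @@ -start,count +start,count @@
            old_range = line.split()[1].lstrip("-")
            start, _, count = old_range.partition(",")
            for offset in range(int(count or 1)):
                results.append((current_file, int(start) + offset))
    return results

def bug_inducing_candidates(repo, fix_commit):
    """Blame each removed line in the fix's parent to find the commit that last touched it."""
    candidates = set()
    for path, lineno in deleted_lines(repo, fix_commit):
        blame = subprocess.run(
            ["git", "-C", repo, "blame", "--porcelain",
             "-L", f"{lineno},{lineno}", f"{fix_commit}^", "--", path],
            capture_output=True, text=True, check=True).stdout
        candidates.add(blame.split()[0])  # first token of porcelain output is the blamed commit hash
    return candidates
```

A full SZZ implementation would additionally filter out candidate commits made after the corresponding bug report was filed; the sketch omits that step for brevity.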
Abstract-When submitting a patch, the primary concerns of individual developers are "How can I maximize the chances of my patch being approved, and minimize the time it takes for this to happen?" In principle, code review is a transparent process that aims to assess the quality of a patch on its technical merits and in a timely manner; in practice, however, the execution of this process can be affected by a variety of factors, some of which are external to the technical content of the patch itself. In this paper, we describe an empirical study of the code review process for WebKit, a large open-source project; we replicate the impact of previously studied factors - such as patch size, priority, and component - and extend these studies by investigating the impact of organizational (the company) and personal dimensions (reviewer load and activity, patch writer experience) on code review response time and outcome. Our approach uses a reverse-engineered model of the patch submission process and extracts key information from the issue tracking and code review systems. Our findings suggest that these non-technical factors can significantly impact code review outcomes.
Organizations are generating, processing, and retaining data at a rate that often exceeds their ability to analyze it effectively; at the same time, the insights derived from these large data sets are often key to the success of the organizations, allowing them to better understand how to solve hard problems and thus gain competitive advantage. Because this data is so fast-moving and voluminous, it is increasingly impractical to analyze using traditional offline, read-only relational databases. Recently, new "big data" technologies and architectures, including Hadoop and NoSQL databases, have evolved to better support the needs of organizations analyzing such data. In particular, Elasticsearch - a distributed full-text search engine - explicitly addresses issues of scalability, big data search, and performance that relational databases were simply never designed to support. In this paper, we reflect upon our own experience with Elasticsearch and highlight its strengths and weaknesses for performing modern mining software repositories research.
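To make the kind of workload described above concrete, the sketch below indexes a few commit records and runs a full-text query with the Elasticsearch Python client (the 8.x-style API is assumed); the host, index name, and document fields are hypothetical and chosen only for illustration.

```python
# Hedged sketch: indexing and searching commit messages with Elasticsearch.
# The host, index name, and document fields are illustrative assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a few commit records (documents are schemaless JSON).
commits = [
    {"sha": "a1b2c3", "author": "alice", "message": "Fix crash in layout engine"},
    {"sha": "d4e5f6", "author": "bob", "message": "Refactor review queue handling"},
]
for commit in commits:
    es.index(index="commits", id=commit["sha"], document=commit)
es.indices.refresh(index="commits")

# Relevance-ranked full-text search over commit messages, the kind of query
# that scales poorly as a LIKE filter in a traditional relational database.
hits = es.search(index="commits", query={"match": {"message": "crash layout"}})
for hit in hits["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["sha"], hit["_source"]["message"])
```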
Abstract-The goal of the code review process is to assess the quality of source code modifications (submitted as patches) before they are committed to a project's version control repository. This process is particularly important in open source projects to ensure the quality of contributions submitted by the community; however, the review process can also promote or discourage these contributions. In this paper, we study the patch lifecycle of the Mozilla Firefox project. The model of the patch lifecycle was extracted both from qualitative evidence of the individual processes (interviews and discussions with developers) and from a quantitative assessment of the Mozilla process and practice. We contrast the lifecycle of a patch in pre- and post-rapid release development. A quantitative comparison showed that while the patch lifecycle remains mostly unchanged after the switch to rapid release, patches submitted by casual contributors are disproportionately more likely to be abandoned than those from core contributors. This suggests that patches from casual developers should receive extra care, both to ensure quality and to encourage future community contributions.