Regression testing is a crucial part of software development. It checks that software changes do not break existing functionality. An important assumption of regression testing is that test outcomes are deterministic: an unmodified test is expected to either always pass or always fail for the same code under test. Unfortunately, in practice, some tests, often called flaky tests, have non-deterministic outcomes. Such tests undermine regression testing because they make it difficult to rely on test results. We present the first extensive study of flaky tests. We study in detail a total of 201 commits that likely fix flaky tests in 51 open-source projects. We classify the most common root causes of flaky tests, identify approaches that could manifest flaky behavior, and describe common strategies that developers use to fix flaky tests. We believe that our insights and implications can help guide future research on the important topic of (avoiding) flaky tests.
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amount of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.
Regression test selection (RTS) aims to reduce regression testing time by only re-running the tests affected by code changes. Prior research on RTS can be broadly split into dynamic and static techniques. A recently developed dynamic RTS technique called Ekstazi is gaining some adoption in practice, and its evaluation shows that selecting tests at a coarser, class-level granularity provides better results than selecting tests at a finer, method-level granularity. As dynamic RTS is gaining adoption, it is timely to also evaluate static RTS techniques, some of which were proposed over three decades ago but not extensively evaluated on modern software projects. This paper presents the first extensive study that evaluates the performance benefits of static RTS techniques and their safety; a technique is safe if it selects all tests that may be affected by code changes. We implemented two static RTS techniques, one class-level and one method-level, and compare several variants of these techniques. We also compare these static RTS techniques against Ekstazi, a state-of-the-art, class-level, dynamic RTS technique. The experimental results on 985 revisions of 22 open-source projects show that the class-level static RTS technique is comparable to Ekstazi, with similar performance benefits, but at the risk of being unsafe sometimes. In contrast, the method-level static RTS technique performs rather poorly.
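To make the idea of class-level static selection concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a class dependency graph has already been extracted statically, and it selects every test class that can reach a changed class through that graph. The function name `select_tests`, the graph layout, and the class names are hypothetical, chosen only for illustration.

```python
from collections import deque


def select_tests(deps, changed, tests):
    """Select test classes that transitively depend on a changed class.

    deps    -- dict mapping a class name to the set of classes it uses
               (assumed to come from some static analysis; hypothetical input)
    changed -- set of class names modified in the new revision
    tests   -- set of class names that are test classes
    """
    # Invert the graph so we can walk from a changed class to its dependents.
    dependents = {}
    for cls, used in deps.items():
        for target in used:
            dependents.setdefault(target, set()).add(cls)

    # Breadth-first search over dependents, starting from the changed classes.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        cls = queue.popleft()
        for dependent in dependents.get(cls, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)

    # Only affected test classes need to be re-run.
    return sorted(t for t in tests if t in affected)


# Hypothetical project: two tests exercise the parser stack, one does not.
deps = {
    "ParserTest": {"Parser"},
    "Parser": {"Lexer"},
    "LexerTest": {"Lexer"},
    "UtilTest": {"Util"},
}
tests = {"ParserTest", "LexerTest", "UtilTest"}

selected_parser = select_tests(deps, {"Parser"}, tests)
selected_lexer = select_tests(deps, {"Lexer"}, tests)
```

In this sketch, a dependency that is missing from the graph would cause an affected test to be skipped, which is what makes a selection unsafe in the sense defined above.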