Software testing is crucial in continuous integration (CI). Ideally, at every commit, all the test cases should be executed and, moreover, new test cases should be generated for the new source code. This is especially true in a Continuous Test Generation (CTG) environment, where the automatic generation of test cases is integrated into the continuous integration pipeline. In this context, developers want to achieve a certain minimum level of coverage for every software build. However, executing all the test cases and, moreover, generating new ones for all the classes at every commit is not feasible. As a consequence, developers have to select which subset of classes should be tested and/or targeted by test-case generation. We argue that knowing a priori the branch coverage that can be achieved with test-data generation tools can help developers make informed decisions about those issues. In this paper, we investigate the possibility of using source-code metrics to predict the coverage achieved by test-data generation tools. We use four different categories of source-code features and assess the prediction on a large data set involving more than 3,000 Java classes. We compare different machine learning algorithms and conduct a fine-grained feature analysis aimed at investigating the factors that most impact the prediction accuracy. Moreover, we extend our investigation to four different search budgets. Our evaluation shows that the best model achieves an average MAE of 0.15 and 0.21 on nested cross-validation over the different budgets for EVOSUITE and RANDOOP, respectively. Finally, the discussion of the results demonstrates the relevance of coupling-related features for the prediction accuracy.

KEYWORDS: automated software testing, coverage prediction, machine learning, software testing
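To make the evaluation setup concrete, the following is a minimal sketch of how a coverage-prediction model of this kind could be trained and assessed with nested cross-validation and MAE, assuming a Python/scikit-learn stack. The feature names, the random data, the choice of regressor, and the hyper-parameter grid are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch (illustrative, not the paper's exact pipeline): predicting
# the branch coverage achieved by a test-data generation tool from
# source-code metrics, evaluated with nested cross-validation and MAE.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)

# Placeholder feature matrix: one row per Java class, one column per
# source-code metric (e.g., size, complexity, coupling-related metrics).
X = rng.random((3000, 4))
# Placeholder target: branch coverage in [0, 1] reached by the tool
# (e.g., EvoSuite or Randoop) under a fixed search budget.
y = rng.random(3000)

# Inner loop tunes hyper-parameters; outer loop gives an unbiased error estimate.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

model = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    scoring="neg_mean_absolute_error",
    cv=inner_cv,
)

# Nested cross-validation: MAE of the tuned model on held-out outer folds.
scores = cross_val_score(model, X, y, scoring="neg_mean_absolute_error", cv=outer_cv)
print(f"average MAE: {-scores.mean():.2f}")
```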
INTRODUCTION

Software testing is widely recognized as a crucial task in any software development process,1 estimated to account for at least half of the entire development cost.2,3 In recent years, we have witnessed a wide adoption of continuous integration (CI) practices, where new or changed code is integrated extremely frequently into the main codebase. Testing plays an important role in such a pipeline: in an ideal world, at every single commit, every test case of the system should be executed (regression testing). Moreover, additional test cases might be automatically generated to test all the new or modified code introduced into the main codebase.4 This is especially true in a Continuous Test Generation (CTG) environment, where the generation of test cases is directly integrated into the continuous integration cycle.4 However, because of the time constraints between frequent commits, complete regression testing is not feasible for large projects.5 Furthermore, even test suite augmentation,6 i.e., the automatic generation of tests that considers code changes and their effect on the previous codebase, is hardly feasible because of the extensive amount of time needed to generate tests for just a single class. As developers want to ensure a cer...