Many software testing fields, such as white-box testing, test case generation, test prioritization, and fault localization, depend on code coverage measurement. If coverage is used only as an overall completeness measure, minor inaccuracies in the data reported by a tool do not matter much; in certain situations, however, they can lead to serious confusion. For example, a code element that is falsely reported as covered can instill false confidence in the test. This work investigates code coverage measurement issues for the Java programming language. For Java, the prevalent approach to code coverage measurement is bytecode instrumentation, owing to its various benefits over source code instrumentation. As we have experienced, bytecode instrumentation-based code coverage tools produce different results than source code instrumentation-based ones in terms of which items are reported as covered. We report on an empirical study comparing the code coverage results provided by tools using the two instrumentation types for Java coverage measurement at the method level. In particular, we want to find out how inaccurate a bytecode instrumentation approach is compared to a source code instrumentation method. The differences are systematically investigated both in quantitative terms (how much the outputs differ) and in qualitative terms (what the causes of the differences are). In addition, the impact on test prioritization and test suite reduction (a possible application of coverage measurement) is investigated in more detail as well.

Keywords Code coverage • white-box testing • Java bytecode instrumentation • source code instrumentation • coverage tools • empirical study

The final publication is available at Springer via