Since benchmarks drive computer science research and industry product development, which ones we use and how we evaluate them are key questions for the community. Despite complex runtime tradeoffs due to dynamic compilation and garbage collection required for Java programs, many evaluations still use methodologies developed for C, C++, and Fortran. SPEC, the dominant purveyor of benchmarks, compounded this problem by institutionalizing these methodologies for their Java benchmark suite. This paper recommends benchmarking selection and evaluation methodologies, and introduces the DaCapo benchmarks, a set of open source, client-side Java benchmarks. We demonstrate that the complex interactions of (1) architecture, (2) compiler, (3) virtual machine, (4) memory management, and (5) application require more extensive evaluation than C, C++, and Fortran which stress (4) much less, and do not require (3). We use and introduce new value, time-series, and statistical metrics for static and dynamic properties such as code complexity, code size, heap composition, and pointer mutations. No benchmark suite is definitive, but these metrics show that DaCapo improves over SPEC Java in a variety of ways, including more complex code, richer object behaviors, and more demanding memory system requirements. This paper takes a step towards improving methodologies for choosing and evaluating benchmarks to foster innovation in system design and implementation for Java and other managed languages.
Pointer analysis benefits many useful clients, such as compiler optimizations and bug finding tools. Unfortunately, common programming language features, such as dynamic loading, reflection, and foreign language interfaces, make pointer analysis difficult. This paper describes how to deal with these features by performing pointer analysis online, during program execution. For example, dynamic loading may load code that is not available for analysis before the program starts.Only an online analysis can analyze such code, and thus support clients that optimize it or find bugs in it. This paper identifies all problems in performing Andersen's pointer analysis for the full Java language, presents solutions to these problems, and uses a full implementation of the solutions in a Java virtual machine for validation and performance evaluation. Our analysis is fast: on average over our benchmark suite, if the analysis recomputes points-to results upon each program change, most analysis pauses take under 0.1 seconds, and add up to 64.5 seconds.
Generics offer significant software engineering benefits since they provide code reuse without compromising type safety. Thus generics will be added to the Java language in the next release. While this extension to Java will help programmers when they are writing new code, it will not help legacy code unless it is rewritten to use generics. In our experience, manually modifying existing programs to use generics is complex and can be error prone and labor intensive. We describe a system, Ilwith, that (i) converts non-generic classes to generic classes and (ii) rewrites their clients to use the newly generified classes. Our experiments with a number of Java container classes show that our system is effective in modifying legacy code to use generics.
Evaluation methodology underpins all innovation in experimental computer science. It requires relevant workloads, appropriate experimental design, and rigorous analysis. Unfortunately, methodology is not keeping pace with the changes in our field. The rise of managed languages such as Java, C#, and Ruby in the past decade and the imminent rise of commodity multicore architectures for the next decade pose new methodological challenges that are not yet widely understood. This paper explores the consequences of our collective inattention to methodology on innovation, makes recommendations for addressing this problem in one domain, and provides guidelines for other domains. We describe benchmark suite design, experimental design, and analysis for evaluating Java applications. For example, we introduce new criteria for measuring and selecting diverse applications for a benchmark suite. We show that the complexity and nondeterminism of the Java runtime system make experimental design a first-order consideration, and we recommend mechanisms for addressing complexity and nondeterminism. Drawing on these results, we suggest how to adapt methodology more broadly. To continue to deliver innovations, our field needs to significantly increase participation in and funding for developing sound methodological foundations.
Generics offer significant software engineering benefits since they provide code reuse without compromising type safety. Thus generics will be added to the Java language in the next release. While this extension to Java will help programmers when they are writing new code, it will not help legacy code unless it is rewritten to use generics. In our experience, manually modifying existing programs to use generics is complex and can be error prone and labor intensive. We describe a system, Ilwith, that (i) converts non-generic classes to generic classes and (ii) rewrites their clients to use the newly generified classes. Our experiments with a number of Java container classes show that our system is effective in modifying legacy code to use generics.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.