Traditional method-based just-in-time (JIT) compilation translates whole methods to optimized machine code. Trace-based compilation generates machine code only for frequently executed paths, so-called traces, which may span multiple methods. In this paper, we present our implementation of a trace-based JIT compiler, which we built by modifying the mature, method-based Java HotSpot client compiler. To simplify trace recording, we added a bytecode preprocessing step that detects and directly marks loops within the bytecodes. We duplicated the existing bytecode interpreter and instrumented it for trace recording. In our implementation, traces can be anchored both at loop headers and at method entries. When a trace anchor has been executed frequently enough, trace recording is started. After traces have been recorded several times, our modified JIT compiler merges the recorded traces into a structure suitable for compilation. During compilation, trace-specific optimizations are applied and guarded with runtime checks if necessary. The generated machine code is then invoked by the interpreter or by already compiled traces. If execution reaches a part of a method that was not covered by traces and was therefore not compiled, or if a runtime guard fails, execution falls back to the interpreter. Benchmarks show improved performance, less generated machine code, and faster compilation compared to the method-based Java HotSpot client compiler. The peak performance of the SPECjvm2008 benchmarks is increased by 9% on average, while 29% less machine code is generated. Similarly, the performance of the SPECjbb2005 benchmark is increased by 13%, and the DaCapo 9.12 Bach benchmarks show a peak performance increase of 5% on average.
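The recording workflow described in this abstract (count executions of an anchor, record traces once it is hot, then merge the recorded traces) can be illustrated with a small Java sketch. Everything here is hypothetical and chosen only for illustration: the class `TraceAnchorSketch`, the thresholds, and the representation of a trace as a list of basic-block ids are assumptions, not the data structures of the HotSpot-based implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class TraceAnchorSketch {
    enum State { COUNTING, RECORDING, READY_TO_COMPILE }

    /** Hypothetical trace anchor (a loop header or a method entry). */
    static final class TraceAnchor {
        private final int hotThreshold;   // assumed: executions before recording starts
        private final int tracesNeeded;   // assumed: recordings before compilation
        private int executions = 0;
        private final List<List<Integer>> recorded = new ArrayList<>();

        TraceAnchor(int hotThreshold, int tracesNeeded) {
            this.hotThreshold = hotThreshold;
            this.tracesNeeded = tracesNeeded;
        }

        State state() {
            if (executions < hotThreshold) return State.COUNTING;
            return recorded.size() < tracesNeeded ? State.RECORDING : State.READY_TO_COMPILE;
        }

        /** Called on each execution of the anchor; path lists the block ids that were run. */
        void onExecute(List<Integer> path) {
            State s = state();
            if (s == State.COUNTING) {
                executions++;                     // still cold: only count
            } else if (s == State.RECORDING) {
                recorded.add(List.copyOf(path));  // hot: record this trace
            }                                     // READY_TO_COMPILE: nothing more to record
        }

        /** Merges all recorded traces into one graph: block id -> observed successors. */
        Map<Integer, Set<Integer>> merge() {
            Map<Integer, Set<Integer>> graph = new TreeMap<>();
            for (List<Integer> trace : recorded) {
                for (int i = 0; i + 1 < trace.size(); i++) {
                    graph.computeIfAbsent(trace.get(i), k -> new TreeSet<>())
                         .add(trace.get(i + 1));
                }
            }
            return graph;
        }
    }
}
```

In this sketch, blocks that never appear in a recorded trace are simply absent from the merged graph, which mirrors the idea that uncovered method parts are not compiled and are left to the interpreter.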
Method inlining is one of the most important optimizations in method-based just-in-time (JIT) compilers. It widens the compilation scope and therefore allows optimizing multiple methods as a whole, which increases performance. However, if method inlining is used too frequently, compilation time increases and too much machine code is generated, which hurts performance. Trace-based JIT compilers compile only frequently executed paths, so-called traces, instead of whole methods. This may result in faster compilation, less generated machine code, and better optimized machine code. In previous work [7], we implemented a trace recording infrastructure and a trace-based compiler for Java by modifying the Java HotSpot VM. Based on this work, we evaluate the effect of trace inlining on the performance and size of the generated machine code. Trace inlining has several major advantages over method inlining. First, trace inlining is more selective than method inlining, as only frequently executed paths are inlined. Second, the recorded traces may contain information about virtual calls, which simplifies inlining. Third, trace information is context-sensitive, so that different method parts can be inlined depending on the specific call site. These advantages allow aggressive inlining while the size of the generated machine code remains reasonable.
Method inlining is one of the most important optimizations in method-based just-in-time (JIT) compilers. It widens the compilation scope and therefore allows optimizing multiple methods as a whole, which increases performance. However, if method inlining is used too frequently, compilation time increases and too much machine code is generated, which hurts performance. Trace-based JIT compilers compile only frequently executed paths, so-called traces, instead of whole methods. This may result in faster compilation, less generated machine code, and better optimized machine code. In previous work, we implemented a trace recording infrastructure and a trace-based compiler for Java by modifying the Java HotSpot VM. Based on this work, we evaluate the effect of trace inlining on the performance and the amount of generated machine code. Trace inlining has several major advantages over method inlining. First, trace inlining is more selective than method inlining, because only frequently executed paths are inlined. Second, the recorded traces may capture information about virtual calls, which simplifies inlining. Third, trace information is context-sensitive, so that different method parts can be inlined depending on the specific call site. These advantages allow more aggressive inlining while the amount of generated machine code remains reasonable. We evaluate several inlining heuristics on the benchmark suites DaCapo 9.12 Bach, SPECjbb2005, and SPECjvm2008 and show that our trace-based compiler achieves up to 51% higher peak performance than the method-based Java HotSpot client compiler. Furthermore, we show that the large compilation scope of our trace-based compiler has a positive effect on other compiler optimizations such as constant folding and null check elimination.
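The context sensitivity mentioned above can be sketched in Java as a toy model in which the recorded parts of a callee are stored per call site, so that each call site inlines only what was observed there. The class `TraceInliningSketch`, the call-site labels, the block names, and the block sizes are all hypothetical; this shows the general idea and is not the heuristics evaluated in the paper.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TraceInliningSketch {
    // Hypothetical store: call site label -> callee blocks observed in traces through it.
    private final Map<String, Set<String>> recordedBlocks = new HashMap<>();

    /** Remembers which callee blocks a trace through the given call site executed. */
    void record(String callSite, String... blocks) {
        recordedBlocks.computeIfAbsent(callSite, k -> new TreeSet<>())
                      .addAll(Arrays.asList(blocks));
    }

    /** Blocks a trace inliner would copy into the caller at this call site. */
    Set<String> blocksToInline(String callSite) {
        return recordedBlocks.getOrDefault(callSite, Collections.emptySet());
    }

    /** Sums the (assumed) code sizes of the given blocks. */
    static int codeSize(Set<String> blocks, Map<String, Integer> blockSizes) {
        int total = 0;
        for (String block : blocks) {
            total += blockSizes.get(block);
        }
        return total;
    }
}
```

With made-up block sizes such as entry=4, fastPath=10, slowPath=60, and exit=2, a call site that only ever took the fast path would inline 16 units in this model, whereas inlining the whole method would cost 76. The selectivity argument in the abstract rests on this kind of difference.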
In several Java VMs, strings consist of two separate objects: metadata such as the string length is stored in the actual string object, while the string characters are stored in a character array. This separation causes unnecessary overhead. Each string method must access both objects, which leads to poor cache behavior and reduces execution speed. We propose to merge the character array with the string's metadata object at run time. This results in a new layout of strings with better cache performance, fewer field accesses, and less memory overhead. We implemented this optimization for Sun Microsystems' Java HotSpot VM, so that the optimization is performed automatically at run time and requires no action on the part of the programmer. The original class String is transformed into the optimized version, and the bytecodes of all methods that allocate string objects are rewritten. All these transformations are performed by the Java HotSpot VM when a class is loaded. Therefore, the time overhead of the transformations is negligible. Benchmarks show a reduction of the average used memory after a full garbage collection and improved performance. The performance of the SPECjbb2005 benchmark increases by 8%, and the average used memory after a full garbage collection is reduced by 19%. The peak performance of SPECjvm98 is improved by 8% on average, with a maximum speedup of 62%.
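To make the memory argument concrete, the following Java sketch estimates the footprint of a string in a two-object layout and in a merged layout. It is a back-of-the-envelope model under stated assumptions. The header sizes, reference size, and alignment are parameters supplied by the caller. The field sets (a character-array reference plus three `int` fields in the separate layout, two `int` fields plus inline characters in the merged layout) are assumptions made for illustration, not figures taken from the paper.

```java
public class StringLayoutSketch {
    /** Rounds size up to the next multiple of alignment. */
    static long align(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    /**
     * Two-object layout: a string object holding a reference to a separate
     * char array plus three int fields (assumed: offset, count, hash).
     */
    static long separateLayoutBytes(int length, int objectHeader, int arrayHeader,
                                    int referenceSize, int alignment) {
        long stringObject = align(objectHeader + referenceSize + 3 * 4, alignment);
        long charArray = align(arrayHeader + 2L * length, alignment);
        return stringObject + charArray;
    }

    /**
     * Merged layout: a single object with two int fields (assumed: count, hash)
     * followed directly by the characters.
     */
    static long mergedLayoutBytes(int length, int objectHeader, int alignment) {
        return align(objectHeader + 2 * 4 + 2L * length, alignment);
    }
}
```

For example, with an assumed 8-byte object header, 12-byte array header, 4-byte references, and 8-byte alignment, a 10-character string occupies 56 bytes in the two-object model and 40 bytes in the merged model. These numbers only illustrate the mechanism and say nothing about the measured 19% reduction.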