On-stack replacement (OSR) is a common technique employed by dynamic compilers to reduce program warm-up time. OSR allows switching from interpreted to compiled code during the execution of this code. The main targets are long running loops, which need to be represented explicitly, with dedicated information about condition and body, to be optimized at run time. Bytecode interpreters, however, represent control flow implicitly via unstructured jumps and thus do not exhibit the required high-level loop representation. To enable OSR also for jump-based-often called unstructuredlanguages, we propose the partial reconstruction of loops in order to explicitly represent them in a bytecode interpreter. Besides an outline of the general idea, we implemented our approach in Sulong, a bytecode interpreter for LLVM bitcode, which allows the execution of C/C++. We conducted an evaluation with a set of C benchmarks, which showed speed-ups in warm-up of up to 9x for certain benchmarks. This facilitates execution of programs with long-running loops in rarely called functions, which would yield significant slowdown without OSR. While shown with a prototype implementation, the overall idea of our approach is generalizable for all bytecode interpreters.
Optimizing compilers use-often hand-crafted-heuristics to control optimizations such as inlining or loop unrolling. These heuristics are based on data such as size and structure of the parts to be optimized. A compilation, however, produces much more (platform specific) data that one could use as input. We thus propose the use of machine learning (ML) to derive better optimization decisions from this wealth of data and to tackle the shortcomings of hand-crafted heuristics. Ultimately, we want to shed light on the quality and performance of optimizations by using empirical data with automated feedback and updates in a production compiler. CCS Concepts: • Computing methodologies → Machine learning; • Software and its engineering → Dynamic compilers; Just-in-time compilers.
Compiler optimization decisions are often based on hand-crafted heuristics centered around a few established benchmark suites. Alternatively, they can be learned from feature and performance data produced during compilation.However, data-driven compiler optimizations based on machine learning models require large sets of quality data for training in order to match or even outperform existing human-crafted heuristics. In static compilation setups, related work has addressed this problem with iterative compilation. However, a dynamic compiler may produce different data depending on dynamically-chosen compilation strategies, which aggravates the generation of comparable data.We propose compilation forking, a technique for generating consistent feature and performance data from arbitrary, dynamically-compiled programs. Different versions of program parts with the same profiling and compilation history are executed within single program runs to minimize noise stemming from dynamic compilation and the runtime environment.Our approach facilitates large-scale performance evaluations of compiler optimization decisions. Additionally, compilation forking supports creating domain-specific compilation strategies based on machine learning by providing the data for model training.We implemented compilation forking in the GraalVM compiler in a programming-language-agnostic way. To assess the quality of the generated data, we trained several machine learning models to replace compiler heuristics for loop-related optimizations. The trained models perform equally well to the highly-tuned compiler heuristics when comparing the geometric means of benchmark suite performances. Larger impacts on few single benchmarks range from speedups of % to slowdowns of %.The presented approach can be implemented in any dynamic compiler. We believe that it can help to analyze compilation decisions and leverage the use of machine learning into dynamic compilation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.