International audienceJava is one of the most popular programming languages in today's software development, but the adoption of Java in some areas like high performance computing, gaming, and media processing is not as universal as in general-purpose computing. A major drawback preventing it from being extensively adopted in those areas is its lower performance than the traditional or domain-specific languages. This paper describes two approaches to improve Java's usability in those areas by introducing vector processing capability to Java. The first approach is to provide a Java vectorization interface (JVI) that developers can program with, to explicitly expose the programs' data parallelism. The other approach is to use automatic vectorization to generate vector instructions for Java programs. It does not require programmers to modify the original source code. We evaluate the two vectorization approaches with SPECjvm2008 benchmark. The performances of scimark.fft and scimark.lu are improved up to 55% and 107% respectively when running in single thread. We also investigate some factors that impact the vectorization effects, including the memory bus bandwidth and the superscalar micro-architecture
Abstract. Compacting garbage collector (GC) is widely used due to its good properties of in-place collection and heap de-fragmentation. In addition, it supports fast bump-pointer allocation and provides good access locality. Most known commercial JVM or CLR implementations use compaction algorithm in certain garbage collection scenarios, such as in full heap or mature object space collections. LISP2 compactor is one of the best-known GC algorithms. As multi-core architecture prevails, several efficient parallel compactors have been proposed. Nevertheless, there is no parallel LISP2 compactor available that can preserve all the sliding properties of its sequential counterpart. That is, to compact live data in-place into a single contiguous region in one end of the heap while maintaining the original object order. In this paper, we propose a fully parallel LISP2 compactor that keeps all the sliding properties. We also prove the correctness of the design. This parallel LISP2 compactor is fully parallel because all of its four phases are parallelized and the workloads are well balanced among the collector threads. The compactor supports fall-back compaction and adjustable boundaries that help deliver the best performance. We have implemented the parallel LISP2 compactor in Apache Harmony, a product-quality open source Java SE implementation. We evaluate and discuss the design on an Intel 8-core platform with representative benchmark.
As multithreaded server applications and runtime systems prevail, garbage collection is becoming an essential feature to support high performance systems, especially those running data-intensive applications. The fundamental issue of garbage collector (GC) design is to maximize the recycled space with minimal time overhead. This paper proposes two innovative solutions: one to improve space efficiency, and the other to improve time efficiency. To achieve space efficiency, we propose the Space Tuner that utilizes the novel concept of allocation speed to reduce wasted space. Conventional static space partitioning techniques often lead to inefficient space utilization. The Space Tuner adjusts the heap partitioning dynamically such that when a collection is triggered, all space partitions are fully filled. To achieve time efficiency, we propose a novel parallelization method that reduces the compacting GC parallelization problem into a tree traversal parallelization problem. This method can be applied for both normal and large object compaction. Object compaction is hard to parallelize due to strong data dependencies such that the source object can not be moved to its target location until the object originally in the target location has been moved out. Our proposed algorithm overcomes the difficulties by dividing the heap into equal-sized blocks and parallelizing the movement of the independent blocks. It Parallel Prog (2011) 39:451-472 is noteworthy that these proposed algorithms are generic such that they can be utilized in different GC designs. The proposed techniques have been implemented in Apache Harmony JVM and we evaluated the proposed algorithms with SPECjbb and Dacapo benchmark suites. The experiment results demonstrate that our proposed algorithms greatly improve space utilization and the corresponding parallelization schemes are scalable, which brings time efficiency.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.