We present a novel approach to dynamic datarace detection for multithreaded object-oriented programs. Past techniques for on-the-fly datarace detection either sacrificed precision for performance, leading to many false positive datarace reports, or maintained precision but incurred significant overheads in the range of 3 x to 30 x . In contrast, our approach results in very few false positives and runtime overhead in the 13% to 42% range, making it both efficient and precise. This performance improvement is the result of a unique combination of complementary static and dynamic optimization techniques.
Distributed enterprise applications today are increasingly being built from services available over the web. A unit of functionality in this framework is a web service, a software application that exposes a set of "typed'' connections that can be accessed over the web using standard protocols. These units can then be composed into a <i>composite</i> web service. BPEL (Business Process Execution Language) is a high-level distributed programming language for creating composite web services. Although a BPEL program invokes services distributed over several servers, the <i>orchestration</i> of these services is typically under centralized control. Because performance and throughput are major concerns in enterprise applications, it is important to remove the inefficiencies introduced by the centralized control. In a distributed, or decentralized orchestration, the BPEL program is partitioned into independent sub-programs that interact with each other without any centralized control. Decentralization can increase parallelism and reduce the amount of network traffic required for an application. This paper presents a technique to partition a composite web service written as a single BPEL program into an equivalent set of decentralized processes. It gives a new code partitioning algorithm to partition a BPEL program represented as a program dependence graph, with the goal of minimizing communication costs and maximizing the <i>throughput</i> of multiple concurrent instances of the input program. In contrast, much of the past work on dependence-based partitioning and scheduling seeks to minimize the <i>completion time</i> of a single instance of a program running in isolation. The paper also gives a cost model to estimate the throughput of a given code partition.
Existing dynamic race detectors suffer from at least one of the following three limitations: (i) space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle; (ii) sequentialization : the parallel program must be processed in a sequential order, usually depth-first [12, 24]. This prevents the analysis from scaling with available hardware parallelism, inherently limiting its performance; (iii) inefficiency : even though race detectors with good theoretical complexity exist, they do not admit efficient implem entations and are unsuitable for practical use [4, 18]. We present a new precise dynamic race detector that leverages structured parallelism in order to address these limitations. Our algorithm requires constant space per memory location, works in parallel, and is efficient in practice. We implemented and evaluated our algorithm on a set of 15 benchmarks. Our experimental results indicate an average (geometric mean) slowdown of 2.78x on a 16-core SMP system.
The Factored Control Flow Graph, FCFG, is a novel representation of a program's intraprocedural control flow, which is designed to efficiently support the analysis of programs written in languages, such as Java, that have frequently occurring operations whose execution may result in exceptional control flow. The FCFG is more compact than traditional CFG representations for exceptional control flow, yet there is no loss of precision in using the FCFG. In this paper, we introduce the FCFG representation and outline how standard forward and backward data flow analysis algorithms can be adapted to work on this representation. We also present empirical measurements of FCFG sizes for a large number of methods obtained from a variety of Java programs, and compare these sizes with those of a traditional CFG representation.
This paper describes a general framework for representing iteration-reordering transformations. These transformations can be both matrix-based and non-matrix-based. Transformations are defined by rules for mapping dependence vectors, rules for mapping loop bound expressions, and rules for creating new initialization statements. The framework is extensible, and can be used to represent any iteration-reordering transformation. Mapping rules for several common transformations are included in the paper.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with đź’™ for researchers
Part of the Research Solutions Family.