With the ubiquity of parallel commodity hardware, developers turn to high-level concurrency models such as the actor model to lower the complexity of concurrent software. However, debugging concurrent software is hard, especially for concurrency models with a limited set of supporting tools. Such tools often deal only with the underlying threads and locks, which obscures the view on e.g. actors and messages and thereby introduces additional complexity. To improve on this situation, we present a low-overhead record & replay approach for actor languages. It allows one to debug concurrency issues deterministically based on a previously recorded trace. Our evaluation shows that the average run-time overhead for tracing on benchmarks from the Savina suite is 10% (min. 0%, max. 20%). For Acme-Air, a modern web application, we see a maximum increase of 1% in latency for HTTP requests and about 1.4 MB/s of trace data. These results are a first step towards deterministic replay debugging of actor systems in production.
Today's complex software systems combine high-level concurrency models. Each model is used to solve a specific set of problems. Unfortunately, debuggers support only the low-level notions of threads and shared memory, forcing developers to reason about these notions instead of the high-level concurrency models they chose. This paper proposes a concurrency-agnostic debugger protocol that decouples the debugger from the concurrency models employed by the target application. As a result, the underlying language runtime can define custom breakpoints, stepping operations, and execution events for each concurrency model it supports, and a debugger can expose them without having to be specifically adapted. We evaluated the generality of the protocol by applying it to SOMns, a Newspeak implementation, which supports a diversity of concurrency models including communicating sequential processes, communicating event loops, threads and locks, fork/join parallelism, and software transactional memory. We implemented 21 breakpoints and 20 stepping operations for these concurrency models. For none of these, the debugger needed to be changed. Furthermore, we visualize all concurrent interactions independently of a specific concurrency model. To show that tooling for a specific concurrency model is possible, we visualize actor turns and message sends separately. This paper presents the Kómpos protocol, a concurrency-agnostic protocol to enable debuggers to support a wide range of concurrency models. Using the Kómpos protocol, we built the Kómpos debugger for online debugging of complex concurrent systems that combine communicating event loops (CEL) [Miller et al. 2005], communicating sequential processes (CSP) [Hoare 1978], software transactional memory (STM) [Harris et al. 2005], fork/join [Blumofe et al. 1995], and shared-memory threads and locks.
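The decoupling described above can be illustrated with a minimal sketch. It assumes the runtime advertises its breakpoint and stepping types as plain metadata and the debugger builds its menus from that metadata alone. All names, tags, and the JSON layout here are hypothetical and chosen for illustration; they are not the actual Kómpos message format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BreakpointType:
    name: str             # identifier the debugger sends back when setting the breakpoint
    label: str            # text shown in the debugger UI
    applicable_to: tuple  # source-section tags where the breakpoint may be offered


@dataclass(frozen=True)
class SteppingType:
    name: str
    label: str
    in_scope: tuple       # kinds of activity in which the step is meaningful


# Hypothetical capabilities of a runtime that supports actors, CSP, and STM.
BREAKPOINTS = (
    BreakpointType("msgSenderBP", "Message send", ("EventualMessageSend",)),
    BreakpointType("channelBeforeRcvBP", "Before channel receive", ("ChannelRead",)),
    BreakpointType("atomicBeforeBP", "Before transaction start", ("Atomic",)),
)
STEPS = (
    SteppingType("stepToMessageRcvr", "Step to message receiver", ("actor",)),
    SteppingType("stepToNextTurn", "Step to next turn", ("actor",)),
    SteppingType("stepToCommit", "Step to commit", ("transaction",)),
)


def runtime_capabilities() -> str:
    """Runtime side: serialize everything the debugger needs to know."""
    return json.dumps({
        "breakpointTypes": [asdict(b) for b in BREAKPOINTS],
        "steppingTypes": [asdict(s) for s in STEPS],
    })


def breakpoint_menu(capabilities_json: str, tags_at_cursor) -> list:
    """Debugger side: offer breakpoints whose tags match the selected source section."""
    caps = json.loads(capabilities_json)
    tags = set(tags_at_cursor)
    return [b["label"] for b in caps["breakpointTypes"] if tags & set(b["applicable_to"])]


def stepping_menu(capabilities_json: str, activity_kind: str) -> list:
    """Debugger side: offer stepping operations valid for the suspended activity."""
    caps = json.loads(capabilities_json)
    return [s["label"] for s in caps["steppingTypes"] if activity_kind in s["in_scope"]]
```

In this sketch the debugger functions never mention actors, channels, or transactions. Supporting another concurrency model would only mean adding entries on the runtime side.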
Based on the concurrency-agnostic protocol, Kómpos supports a rich set of breakpoints for the various concurrency abstractions, a rich set of stepping semantics to explore program behavior, a generic visualization of interactions between concurrent entities, as well as an actor-specific visualization of turns and message sends. This evaluation shows that the Kómpos protocol (1) is general enough to support advanced debugger features for shared-memory and message-passing models, and (2) supports tools that use the provided data independently of any concurrency model, while preserving the ability to build tools specific to a single model. Kómpos is a debugger for SOMns, a Newspeak implementation [Bracha et al. 2010] based on Truffle [Würthinger et al. 2012]. SOMns' debugger support is built on Truffle's tooling and debugger features [Seaton et al. 2014; Van De Vanter 2015]. SOMns supports the five aforementioned concurrency models and implements breakpoints, stepping, and a tracing mechanism for them. Because of the concurrency-agnostic design of the Kómpos protocol, the Kómpos debugger is independent from these concurrency models. The contributions of this paper are:
With concurrency being integral to most software systems, developers combine high-level concurrency models in the same application to tackle each problem with appropriate abstractions. While languages and libraries offer a wide range of concurrency models, debugging support for applications that combine them has not yet gained much attention. Record & replay aids debugging by deterministically reproducing recorded bugs, but is typically designed for a single concurrency model only. This paper proposes a practical concurrency-model-agnostic record & replay approach for multi-paradigm concurrent programs, i.e. applications that combine concurrency models. Our approach traces high-level nondeterministic events by using a uniform model-agnostic trace format and infrastructure. This enables ordering-based record & replay support for a wide range of concurrency models, and thereby enables debugging of applications that combine them. In addition, it allows language implementors to add new concurrency models and reuse the model-agnostic record & replay support. We argue that a concurrency-model-agnostic record & replay is practical and enables advanced debugging support for a wide range of concurrency models. The evaluation shows that our approach is expressive and flexible enough to support record & replay of applications using threads & locks, communicating event loops, communicating sequential processes, software transactional memory and combinations of those concurrency models. For the actor model, we reach recording performance competitive with an optimized special-purpose record & replay solution. The average recording overhead on the Savina actor benchmark suite is 10 % (min. 0 %, max. 23 %). The performance for other concurrency models and combinations thereof is at a similar level. We believe our concurrency-model-agnostic approach helps developers of applications that mix and match concurrency models.
We hope that this substrate inspires new tools and languages that make building and maintaining multi-paradigm concurrent applications simpler and safer. ACM CCS 2012: Computing methodologies → Concurrent programming languages; Software and its engineering → Software maintenance tools;
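As a rough illustration of ordering-based record & replay, the sketch below records only the order in which a receiving entity processed messages from its senders, and during replay buffers early arrivals until the recorded order can be honored. It is a simplified sketch under stated assumptions, not the trace format or infrastructure from the paper: the class names, the `"counter"` receiver, and the assumption that each sender's messages arrive in FIFO order are all hypothetical.

```python
from collections import deque


class Recorder:
    """Records, per receiver, the sender of each processed message (ordering only)."""

    def __init__(self):
        self.trace = {}

    def on_receive(self, receiver, sender):
        self.trace.setdefault(receiver, []).append(sender)


class ReplayMailbox:
    """Delivers messages in the recorded sender order, buffering early arrivals."""

    def __init__(self, expected_senders):
        self.expected = deque(expected_senders)
        self.pending = {}  # sender -> deque of buffered messages

    def enqueue(self, sender, msg):
        self.pending.setdefault(sender, deque()).append(msg)

    def drain(self):
        """Return the messages that can be delivered now without violating the order."""
        out = []
        while self.expected:
            queue = self.pending.get(self.expected[0])
            if not queue:
                break  # the next expected sender has not delivered yet
            out.append(queue.popleft())
            self.expected.popleft()
        return out


def record_run(arrivals):
    """Process messages in arrival order and record the resulting sender order."""
    recorder = Recorder()
    processed = []
    for sender, msg in arrivals:
        recorder.on_receive("counter", sender)
        processed.append(msg)
    return recorder.trace, processed


def replay_run(trace, arrivals):
    """Process messages in the recorded order, whatever the new arrival order is."""
    mailbox = ReplayMailbox(trace["counter"])
    processed = []
    for sender, msg in arrivals:
        mailbox.enqueue(sender, msg)
        processed.extend(mailbox.drain())
    return processed
```

Because only sender identities are recorded, the trace stays small. The message contents are reproduced by re-executing the senders, which is what makes an ordering-based scheme cheaper than logging full message payloads.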
The actor model is popular for many types of server applications. Efficient snapshotting of applications is crucial in the deployment of pre-initialized applications or moving running applications to different machines, e.g. for debugging purposes. A key issue is that snapshotting blocks all other operations. In modern latency-sensitive applications, stopping the application to persist its state needs to be avoided, because users may not tolerate the increased request latency. In order to minimize the impact of snapshotting on request latency, our approach persists the application's state asynchronously by capturing partial heaps, completing snapshots step by step. Additionally, our solution is transparent and supports arbitrary object graphs. We prototyped our snapshotting approach on top of the Truffle/Graal platform and evaluated it with the Savina benchmarks and the Acme Air microservice application. When performing a snapshot every thousand Acme Air requests, the number of slow requests (0.007% of all requests) with latency above 100 ms increases by 5.43%. Our Savina microbenchmark results detail how different utilization patterns impact snapshotting cost. To the best of our knowledge, this is the first system that enables asynchronous snapshotting of actor applications, i.e. without stop-the-world synchronization, and thereby minimizes the impact on latency. We thus believe it enables new deployment and debugging options for actor systems. CCS Concepts • Software and its engineering → Software testing and debugging; • Theory of computation → Concurrency.