Summary

This paper presents CoopREP, a system that supports fault replication of concurrent programs through cooperative recording and partial log combination. CoopREP uses partial logging to reduce the amount of information that each program instance must store to support deterministic replay. This substantially reduces the overhead imposed by code instrumentation, but raises the problem of finding a combination of logs capable of replaying the fault. CoopREP tackles this issue by introducing several innovative statistical analysis techniques aimed at guiding the search for the partial logs to be combined for the replay phase. CoopREP has been evaluated using both standard benchmarks for multithreaded applications and real-world applications. The results highlight that CoopREP can successfully replay concurrency bugs involving tens of thousands of memory accesses, while reducing recording overhead with respect to state-of-the-art noncooperative logging schemes by up to 13× (and by 2.4× on average).
KEYWORDS

concurrency errors, debugging, partial logging, record and replay
INTRODUCTION

Concurrent programming is of paramount importance to exploit the full potential of emerging multicore architectures. However, writing and debugging concurrent programs is notoriously difficult. Contrary to most bugs in sequential programs, which usually depend exclusively on the program input and the execution environment (and can therefore be reproduced more easily), concurrency bugs depend on nondeterministic interactions among threads. This means that even when re-executing the same code with identical inputs, on the same machine, the outcome of the program may differ from run to run [1].

Deterministic replay (or record and replay) addresses this issue by recording nondeterministic events (such as the order of accesses to shared-memory locations) during a failing execution and then using the resulting trace to support the reproduction of the error [2]. Classic approaches [3-6], also referred to as order-based, trace the relative order of all relevant events, thus allowing the bug to be replayed at the first attempt. Unfortunately, they also come with an excessively high recording cost (10×-100× slowdown), which is impractical for most settings.

Motivated by the observation that the most significant performance constraints are on production runs, more recent solutions have adopted a search-based approach [1,7-9]. Search-based solutions reduce the recording overhead at the cost of a longer reproduction time during diagnosis. To this end, they typically log incomplete information at runtime and rely on post-recording exploration techniques to complete the missing data. These techniques explore various trade-offs between recording overhead and replay efficacy (i.e., the number of replay attempts required to reproduce the bug).

The work in this paper aims at further reducing the overhead achievable using either order-based or search-based techniques, by devising cooperative logging schemes tha...
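The kind of nondeterminism discussed above can be illustrated with a minimal sketch (not taken from the paper; class and variable names are hypothetical): two threads perform unsynchronized read-modify-write updates on a shared counter, so identical inputs can yield different final values across runs. Reproducing a particular faulty outcome requires knowing the interleaving of the shared-memory accesses, which is exactly what order-based recorders log.

```java
// Hypothetical illustration of a concurrency bug: the final value of
// `counter` depends on the thread interleaving, not on the program input.
public class RacyCounter {
    static int counter = 0;                 // shared, unsynchronized state
    static final int ITERATIONS = 100_000;

    static void work() {
        for (int i = 0; i < ITERATIONS; i++) {
            counter++;                      // non-atomic: load, add, store
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(RacyCounter::work);
        Thread t2 = new Thread(RacyCounter::work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Lost updates typically make the result smaller than 200000;
        // the exact value varies from run to run with identical input.
        System.out.println("final counter = " + counter);
    }
}
```

Replaying a specific outcome of this program deterministically requires a trace of the relative order of the accesses to `counter`, which is why full order-based logging is expensive: every shared access must be instrumented.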