Provable GPU Data-Races in Static Race Detection

Motivated by the increasing importance of general-purpose GPU (GPGPU) programming, exemplified by NVIDIA’s CUDA framework, as well as the difficulty, especially for novice programmers, of reasoning about performance in GPGPU kernels, we introduce a novel quantitative program logic for CUDA kernels. The logic allows programmers to reason about both functional correctness and resource usage of CUDA kernels, paying particular attention to a set of common but CUDA-specific performance bottlenecks: warp divergences, uncoalesced memory accesses, and bank conflicts. The logic is proved sound with respect to a novel operational cost semantics for CUDA kernels. The semantics, logic and soundness proofs are formalized in Coq. An inference algorithm based on LP solving automatically synthesizes symbolic resource bounds by generating derivations in the logic. This algorithm is the basis of RaCUDA, an end-to-end resource-analysis tool for kernels, which has been implemented using an existing resource-analysis tool for imperative programs. An experimental evaluation on a suite of benchmarks shows that the analysis is effective in aiding the detection of performance bugs in CUDA kernels.


Sound and Partially-Complete Static Analysis of Data-Races in GPU Programs

Liew, Cogumbreiro, Lange. 2024. Proc. ACM Program. Lang.


GPUs are progressively being integrated into modern society, playing a pivotal role in Artificial Intelligence and High-Performance Computing. Programmers need a deep understanding of the GPU programming model to avoid subtle data-races in their code. Static verification that is sound and incomplete can guarantee data-race freedom, but the alarms it raises may be spurious and need to be validated. In this paper, we establish a True Positive Theorem for a static data-race detector for GPU programs, i.e., a result that identifies a class of programs for which our technique only raises true alarms. Our work builds on the formalism of memory access protocols, which models the concurrency operations of CUDA programs. The crux of our approach is an approximation analysis that can correctly identify true alarms and pinpoint the conditions that make an alarm imprecise. Our approximation analysis detects when the reported locations are reachable (control independence, or CI) and when the reported locations are precise (data independence, or DI), and it identifies inexact values in an alarm. In addition to a True Positive result for programs that are CI and DI, we establish the root causes of spurious alarms depending on whether CI or DI is present. We apply our theory to introduce FaialAA, the first sound and partially complete data-race detector. We evaluate FaialAA in three experiments. First, in a comparative study with the state-of-the-art tools, we show that FaialAA confirms more DRF programs than others while emitting 1.9× fewer potential alarms. Importantly, the approximation analysis of FaialAA detects 10 undocumented data-races. Second, in an experiment studying 6 commits of data-race fixes in the open-source projects OpenMM and Nvidia’s MegaTron, FaialAA confirmed the buggy and fixed versions of 5 commits, while others were only able to confirm 2. Third, we show that 59.5% of 2,770 programs are CI and DI, quantifying when the approximation analysis of FaialAA is complete. This paper is accompanied by the mechanized proofs of the theoretical results presented therein and a tool (FaialAA) implementing our theory.




Cited by 3 publications

References 19 publications

Memory access protocols: certified data-race freedom for GPU kernels


Modeling and Analyzing Evaluation Cost of CUDA Kernels

Sound and Partially-Complete Static Analysis of Data-Races in GPU Programs
