Large software projects contain significant code duplication, mainly due to copying and pasting code. Many techniques have been developed to identify duplicated code to enable applications such as refactoring, detecting bugs, and protecting intellectual property. Because source code is often unavailable, especially for third-party software, finding duplicated code in binaries becomes particularly important. However, existing techniques operate primarily on source code, and no effective tool exists for binaries. In this paper, we describe the first practical clone detection algorithm for binary executables. Our algorithm extends an existing tree similarity framework based on clustering of characteristic vectors of labeled trees with novel techniques to normalize assembly instructions and to accurately and compactly model their structural information. We have implemented our technique and evaluated it on Windows XP system binaries totaling over 50 million assembly instructions. Results show that it is both scalable and precise: it analyzed Windows XP system binaries in a few hours and produced few false positives. We believe our technique is a practical, enabling technology for many applications dealing with binary code.
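To make the idea of normalized instructions and characteristic vectors concrete, the following is a minimal sketch and not the paper's implementation. It assumes Intel-style x86 syntax, a small hypothetical register set, and hypothetical operand classes (`REG`, `MEM`, `VAL`). It also simplifies in two ways: it counts flat instruction features rather than patterns in labeled trees, and it compares two regions directly by Euclidean distance rather than clustering many vectors.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical, deliberately small register set for illustration only.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def normalize_operand(op):
    """Replace a concrete operand with its class (an assumed scheme)."""
    op = op.strip().lower()
    if op in REGISTERS:
        return "REG"
    if op.startswith("["):
        return "MEM"
    if re.fullmatch(r"-?(0x[0-9a-f]+|\d+)", op):
        return "VAL"
    return "OTHER"


def normalize(instr):
    """Turn 'mov eax, [ebp+8]' into ('mov', 'REG', 'MEM')."""
    parts = instr.strip().split(None, 1)
    mnemonic = parts[0].lower()
    operands = parts[1].split(",") if len(parts) > 1 else []
    return (mnemonic, *[normalize_operand(o) for o in operands])


def characteristic_vector(instrs):
    """Count mnemonics and per-position operand classes in a code region."""
    counts = Counter()
    for instr in instrs:
        mnemonic, *ops = normalize(instr)
        counts[mnemonic] += 1
        for position, kind in enumerate(ops):
            counts[f"op{position}:{kind}"] += 1
    return counts


def euclidean(a, b):
    """Euclidean distance between two sparse count vectors."""
    keys = set(a) | set(b)
    return sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


region_a = ["mov eax, [ebp+8]", "add eax, 4", "push eax"]
region_b = ["mov ecx, [ebp+12]", "add ecx, 16", "push ecx"]
region_c = ["xor eax, eax", "ret"]

dist_ab = euclidean(characteristic_vector(region_a), characteristic_vector(region_b))
dist_ac = euclidean(characteristic_vector(region_a), characteristic_vector(region_c))
```

Under these assumptions, `region_a` and `region_b` differ only in register names and constants, so normalization maps them to the same vector and their distance is zero, while `region_c` lands farther away. A tool operating at the scale described above would need an approximate nearest-neighbor or clustering step instead of pairwise comparison.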
Software is among the most complex human artifacts, and visualization is widely acknowledged as important to understanding software. In this paper, we consider the problem of understanding a software system's architecture through visualization. Whereas traditional visualizations use multiple stakeholder-specific views to present different kinds of task-specific information, we propose an additional visualization technique that unifies the presentation of various kinds of architecture-level information, thereby allowing a variety of stakeholders to quickly see and communicate current development, quality, and costs of a software system. For future empirical evaluation of multi-aspect, single-view architectural visualizations, we have implemented our idea in an existing visualization tool, Vizz3D. Our implementation includes techniques, such as the use of a city metaphor, that reduce visual complexity in order to support single-view visualizations of large-scale programs.
We present initial work on perturbation techniques that cause the manifestation of timing-related bugs in distributed memory Message Passing Interface (MPI)-based applications. These techniques improve the coverage of possible message orderings in MPI applications that rely on nondeterministic point-to-point communication and work with small processor counts to alleviate the need to test at larger scales. Using carefully designed model problems, we show that these techniques aid testing for problems that are often not easily reproduced when running on small fractions of the machine. Our perturbation layer, JitterBug, builds on PnMPI, an extension of the MPI profiling interface that supports multiple layers of profiling libraries. We discuss how JitterBug complements existing MPI checking tools through the PnMPI framework. We present opportunities to build additional tools that statically analyze and directly transform the source code to support testing and debugging MPI applications at reduced scale.
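The effect of timing perturbation on message-ordering coverage can be illustrated with a small simulation. This is a sketch under stated assumptions, not JitterBug itself: it uses no MPI calls, models each sender with a fixed hypothetical base latency, and treats a wildcard receive as matching messages in arrival order. The names `arrival_order` and `observed_orders`, the latencies, and the jitter range are all illustrative.

```python
import random


def arrival_order(base_latency, max_jitter, rng):
    """Return sender ranks in the order a wildcard receive would match them."""
    arrivals = []
    for rank, latency in enumerate(base_latency):
        # Injected delay stands in for the perturbation layer (assumed uniform).
        delay = rng.uniform(0.0, max_jitter) if max_jitter > 0 else 0.0
        arrivals.append((latency + delay, rank))
    return tuple(rank for _, rank in sorted(arrivals))


def observed_orders(base_latency, max_jitter, trials=200, seed=0):
    """Collect the distinct message orderings seen over repeated runs."""
    rng = random.Random(seed)
    return {arrival_order(base_latency, max_jitter, rng) for _ in range(trials)}


# Hypothetical base latencies for three senders on a small test run.
base = [1.0, 2.0, 3.0]
unperturbed = observed_orders(base, max_jitter=0.0)
perturbed = observed_orders(base, max_jitter=5.0)
```

Without injected delay, every simulated run yields the single ordering `(0, 1, 2)`, so a bug that depends on any other ordering would never surface. With jitter larger than the gaps between base latencies, repeated runs cover additional orderings, which is the kind of coverage gain the abstract describes for small processor counts.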