Proceedings of the 8th Working Conference on Mining Software Repositories 2011
DOI: 10.1145/1985441.1985453

Finding software license violations through binary code clone detection

Abstract: Software released in binary form frequently uses third-party packages without respecting their licensing terms. For instance, many consumer devices have firmware containing the Linux kernel, without the suppliers following the requirements of the GNU General Public License. Such license violations are often accidental, e.g., when vendors receive binary code from their suppliers with no indication of its provenance. To help find such violations, we have developed the Binary Analysis Tool (BAT), a system for cod…

Citations: cited by 115 publications (68 citation statements)
References: 20 publications
“…Hemel et al [27] introduced BAT, a tool that detects code cloning in binaries. They implemented three binary clone detection techniques.…”
Section: License Violations
Citation type: mentioning
confidence: 99%
“…When confronted with the presence of code cloning within a software system, a manager will want to understand why the decision was made to do so, and what the perceived short- and long-term trade-offs are. Cloning across different projects is more difficult to manage; of special concern is the possibility of code from an open source project being inserted into a closed source system [4,2], which may open up the company to legal action, such as was alleged in the recent lawsuit involving Oracle and Google [18].…”
Section: The Problem Of Software Development Provenance
Citation type: mentioning
confidence: 99%
“…For example, when a manager finds a section of code that is involved in an unusually large number of problems, he may wish to find out how old the code in question is, which developers have been working on it recently, which change sets it has been involved in, and which features, quality attributes, and bug fixes it implements. When an IP lawyer is concerned about a claim of open source licensing violations within a closed source system, she may wish to be able to track the design history of the code in question [4]. When a software architect is trying to understand the current design of a module within a software system, he may wish to be able to compare the change history against discussions in the developer mailing list archives [5].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In contrast, since the goal of Rendezvous is to do binary clone matching across different code bases, we needed to address the compiler optimisation problem and we believe our technique to be sufficiently accurate to be successful. Hemel et al [18] looked purely at strings in the binary to uncover code violating the GNU public license. The advantage of their technique was that it eliminated the need to perform disassembly.…”
Section: Related Work
Citation type: mentioning
confidence: 99%