A static API birthmark for Windows binary executables

Choi, Seokwoo; Park, Heewan; Lim, Hyeong-Seok; Han, Tian

doi:10.1016/j.jss.2008.11.848

Cited by 45 publications

(42 citation statements)

References 28 publications

“…In addition, Park et al proposed a static API-call-based birthmark for software theft detection of Java applications [26]. Choi et al additionally presented a static API birthmark for Windows execution files using a set of API calls identified as being static by a disassembler [27]. In addition to the above static birthmark generation techniques, several dynamic API-based birthmarks have been proposed.…”

Section: API-based Birthmarks (mentioning)

confidence: 99%

API-Based Software Birthmarking Method Using Fuzzy Hashing

Lee

Kang

Choi

et al. 2016

IEICE Trans. Inf. & Syst.

Donghoon Lee, Dongwoo Kang, Younsung Choi, Jiye Kim, and Dongho Won

Summary: The software birthmarking technique has conventionally been studied in fields such as software piracy, code theft, and copyright infringement. The most recent API-based software birthmarking method (Han et al., 2014) extracts API call sequences in entire code sections of a program. Additionally, it is generated as a birthmark using a cryptographic hash function (MD5). It was reported that different application types can be categorized in a program through prefiltering based on DLL/API numbers/names. However, similarity cannot be measured owing to the cryptographic hash function, occurrence of false negatives, and it is difficult to functionally categorize applications using only DLL/API numbers/names. In this paper, we propose an API-based software birthmarking method using fuzzy hashing. For the native code of a program, our software birthmarking technique extracts API call sequences in the segmented procedures and then generates them using a fuzzy hash function. Unlike the conventional cryptographic hash function, the fuzzy hash is used for the similarity measurement of data. Our method using a fuzzy hash function achieved a high reduction ratio (about 41% on average) more than an original birthmark that is generated with only the API call sequences. In our experiments, when threshold ε is 0.35, the results show that our method is an effective birthmarking system to measure similarities of the software. Moreover, our correlation analysis with top 50 API call frequencies proves that it is difficult to functionally categorize applications using only DLL/API numbers/names. Compared to prior work, our method significantly improves the properties of resilience and credibility.

Key words: software birthmark, birthmarking systems, software similarity, fuzzy hash, API-based sequences

“…Unlike software watermarks, there is no need for prior information embedding; features unique to the program are taken from the compiled binary and defined as the birthmark. Several different types of birthmark that focus on a different program features have been proposed [3][4][5][6][7][8].…”

Section: Related Work (mentioning)

confidence: 99%

“…The extracted elements are maintained in a structure such as a set, sequence or graph, and the similarity computation method is defined in accordance with the structure used. For example, the Jaccard index [7], Dice's coefficient [4] or cosine similarity [6] are often used for sets, and the longest common subsequence [5,8] is used for sequences. Methods for graphs are more complex; however, methods using graph isomorphism have been proposed [3,9].…”

Section: Related Work (mentioning)

confidence: 99%
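The similarity measures named in the citation statement above can be illustrated with a minimal sketch. It assumes a birthmark is simply a collection of API names; the sample call lists `p` and `q` and all function names are hypothetical and are not taken from any of the cited papers.

```python
from collections import Counter
from math import sqrt


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| (1.0 by convention if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(a: set, b: set) -> float:
    """Dice's coefficient 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two API-frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def lcs_length(x: list, y: list) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            curr.append(prev[j - 1] + 1 if xi == yj else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Hypothetical API-call birthmarks of two programs (illustrative data only).
p = ["CreateFileW", "ReadFile", "CloseHandle", "RegOpenKeyExW"]
q = ["CreateFileW", "WriteFile", "CloseHandle"]

print("Jaccard:", jaccard(set(p), set(q)))
print("Dice:   ", dice(set(p), set(q)))
print("Cosine: ", cosine(Counter(p), Counter(q)))
print("LCS:    ", lcs_length(p, q))
```

The set measures ignore call order, whereas the longest common subsequence keeps it; which one is appropriate depends on whether the birthmark is stored as a set or as a sequence.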

Static Software Birthmark based on Multiple Attributes

Shi

2017

Proceedings of the 2017 International Conference on Mechanical, Electronic, Control and Automation Engineering (MECAE 2017)

Abstract. Software birthmarks have been proposed as a method for enabling the detection of programs that may have been stolen by measuring the similarity between the two programs. A birthmark is created from each program by extracting its native characteristics. The birthmarks of the programs can then be compared. However, existing software birthmarks are mainly single-attribute birthmarks: they work well in specific scenarios but have poor robustness in general. Because a single-attribute birthmark has poor robustness and weak resistance to attacks, this paper takes a multi-attribute view of birthmark extraction. It first proposes a hierarchical analysis of the relevant attributes of Java software and, on that basis, a multi-attribute static software birthmark.

“…But these approaches [7,8] ignore the frequency of API calls in the sequences and suffer from the same problem as normal signature approaches and become similar to signature based approach resulting in a more false positives outcome [9]. Windows Application Program Interface (API) function calls [10][11][12][10] have been used in statistical N-gram modeling techniques [11,12] for detection. However these approaches [11,12] use simple wrapper classification methods [13] which did not explore the ways of selecting the best set of APIs from a large set of APIs.…”

Section: Introduction (mentioning)

confidence: 99%
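As a rough illustration of the N-gram modeling of API calls mentioned in the statement above, the sketch below counts contiguous n-grams in an API call trace. The function name `api_ngrams` and the sample trace are hypothetical; this is not the feature extraction used in the cited work.

```python
from collections import Counter


def api_ngrams(calls, n=2):
    """Count contiguous n-grams (as tuples) in a sequence of API call names."""
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


# Hypothetical API call trace (illustrative data only).
trace = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc",
         "LoadLibraryA", "GetProcAddress"]

bigrams = api_ngrams(trace, n=2)
total = sum(bigrams.values())

# Relative frequencies could serve as features for a downstream classifier.
for gram, count in bigrams.most_common():
    print(" -> ".join(gram), count / total)
```

Unlike a plain set of API names, these counts retain how often each short call pattern occurs, which is the frequency information the statement says sequence-only approaches ignore.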

Hybrids of support vector machine wrapper and filter based framework for malware detection

Huda

Abawajy

Alazab

et al. 2016

Future Generation Computer Systems

Highlights
• A signature-free malware detection approach has been proposed.
• A hybrid wrapper-Filter based malware feature selection has been proposed.
• Proposed hybrid approach can take advantages from both filter and wrapper.
• Models have also been validated by statistical model selection criteria such as Chi Square and Akaike information criterion (AIC).

Abstract
Malware replicates itself and produces offspring with the same characteristics but different signatures by using code obfuscation techniques. Current generation Anti-Virus (AV) engines employ a signature-template type detection approach where malware can easily evade existing signatures in the database. This reduces the capability of current AV engines in detecting malware. In this paper we propose a hybrid framework for malware detection by using the hybrids of Support Vector Machines Wrapper, Maximum-Relevance-Minimum-Redundancy Filter heuristics where Application Program Interface (API) call statistics are used as a malware features. The novelty of our hybrid framework is that it injects the filter's ranking score in the wrapper selection process and combines the properties of both wrapper and filters and API call statistics which can detect malware based on the nature of infectious actions instead of signature. To the best of our knowledge, this kind of hybrid approach has not been explored yet in the literature in the context of feature selection and malware detection. Knowledge about the intrinsic characteristics of malicious activities is determined by the API call statistics which is injected as a filter score into the wrapper's backward elimination process in order to find the most significant APIs. While using the most significant APIs in the wrapper classification on both obfuscated and benign types malware datasets, the results show that the proposed hybrid framework clearly surpasses the existing models including the independent filters and wrappers using only a very compact set of significant APIs. The performances of the proposed and existing models have further been compared using binary logistic regression. Various goodness of fit comparison criteria such as Chi Square, Akaike's Information Criterion (AIC) and Receiver Operating Characteristic Curve ROC are deployed to identify the best performing models. Experimental outcomes based on the above criteria also show that the proposed hybrid framework outperforms other existing models of signature types including independent wrapper and filter approaches to identify malware.
