Donghoon LEE†a) , Dongwoo KANG †b) , Younsung CHOI † †c) , Jiye KIM †d) , Nonmembers, and Dongho WON †e) , Member SUMMARY The software birthmarking technique has conventionally been studied in fields such as software piracy, code theft, and copyright infringement. The most recent API-based software birthmarking method (Han et al., 2014) extracts API call sequences in entire code sections of a program. Additionally, it is generated as a birthmark using a cryptographic hash function (MD5). It was reported that different application types can be categorized in a program through prefiltering based on DLL/API numbers/names. However, similarity cannot be measured owing to the cryptographic hash function, occurrence of false negatives, and it is difficult to functionally categorize applications using only DLL/API numbers/names. In this paper, we propose an API-based software birthmarking method using fuzzy hashing. For the native code of a program, our software birthmarking technique extracts API call sequences in the segmented procedures and then generates them using a fuzzy hash function. Unlike the conventional cryptographic hash function, the fuzzy hash is used for the similarity measurement of data. Our method using a fuzzy hash function achieved a high reduction ratio (about 41% on average) more than an original birthmark that is generated with only the API call sequences.In our experiments, when threshold ε is 0.35, the results show that our method is an effective birthmarking system to measure similarities of the software. Moreover, our correlation analysis with top 50 API call frequencies proves that it is difficult to functionally categorize applications using only DLL/API numbers/names. Compared to prior work, our method significantly improves the properties of resilience and credibility. key words: software birthmark, birthmarking systems, software similarity, fuzzy hash, API-based sequences