2005
DOI: 10.1007/s11416-005-0002-9

Malware phylogeny generation using permutations of code

Abstract: Malicious programs, such as viruses and worms, are frequently related to previous programs through evolutionary relationships. Discovering those relationships and constructing a phylogeny model is expected to be helpful for analyzing new malware and for establishing a principled naming scheme. Matching permutations of code may help build better models in cases where malware evolution does not keep things in the same order. We describe methods for constructing phylogeny models that use features called n-perms …
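
To make the idea of matching permuted code concrete, here is a minimal sketch. It assumes an n-perm is an n-gram whose internal order is ignored; the abstract is truncated before it gives a definition, so treat that reading as an assumption. The opcode sequences are hypothetical, and the Jaccard score is an illustrative choice rather than the similarity measure used in the paper.

```python
def ngrams(ops, n):
    """Ordered windows of n consecutive items."""
    return [tuple(ops[i:i + n]) for i in range(len(ops) - n + 1)]


def nperms(ops, n):
    """The same windows, with the order inside each window ignored.

    Assumption: sorting each window is one simple way to make
    permuted windows compare equal.
    """
    return [tuple(sorted(window)) for window in ngrams(ops, n)]


def jaccard(a, b):
    """Set overlap between two feature lists (illustrative measure only)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical opcode sequences: the variant swaps its first two instructions.
original = ["push", "mov", "add", "call", "ret"]
variant = ["mov", "push", "add", "call", "ret"]

ngram_score = jaccard(ngrams(original, 3), ngrams(variant, 3))
nperm_score = jaccard(nperms(original, 3), nperms(variant, 3))
print(ngram_score, nperm_score)
```

In this toy case the order-insensitive features overlap more than the ordered ones, which is the effect the abstract points to when evolution reorders code.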

Cited by 181 publications (119 citation statements)
References: 12 publications
“…for authorship attribution, due to two factors. First, there are fewer programs per author (4-7) than in the other data sets (8-16), making this a fundamentally harder learning problem. More importantly, the programs in this data set do not reflect only the work of individual programmers; students in the course were often provided with substantial amounts of partially implemented skeleton code, and also worked closely with the course professor to follow an often rigid specification at the sub-module level.…”
Section: Classification (mentioning)
confidence: 99%
“…The instruction-level features we use are similar to those used in malware classification [2,8,9], particularly n-grams; our idiom features differ from features based on instruction sequences through the use of wildcards and the abstraction of low-level details like the opcode and immediate values. The instruction summary colors we use in the graphlet features are inspired by a technique to identify polymorphic malware variants [11]. Although some of the binary code representations we use are similar to existing work, our techniques are largely orthogonal: malware classification seeks to extract characteristics specific to a program or a family of programs with related behavior, while our authorship attribution techniques must discover more general properties of author style.…”
Section: Related Work (mentioning)
confidence: 99%
“…In their proposed work, the authors of [34] and [35] detected morphed malware variants using a rewriting engine. The syntactic and semantic structure of the variant programs was analysed.…”
Section: Existing Work (mentioning)
confidence: 99%
“…Previous research on malware phylogeny inference mainly focused on tree-based models [1]-[6]. Karim et al [1] used the UPGMA algorithm to generate phylogeny trees.…”
Section: Introduction (mentioning)
confidence: 99%
“…Karim et al [1] used the UPGMA algorithm to generate phylogeny trees. Gupta et al [6] proposed graph pruning techniques to establish phylogeny trees of malcode based on temporal information.…”
Section: Introduction (mentioning)
confidence: 99%