2014
DOI: 10.1016/j.jare.2014.01.001

Unsupervised, low latency anomaly detection of algorithmically generated domain names by generative probabilistic modeling

Abstract: We propose a method for detecting anomalous domain names, with a focus on algorithmically generated domain names, which are frequently associated with malicious activities such as fast flux service networks, particularly for bot networks (or botnets), malware, and phishing. Our method learns a (null hypothesis) probability model from a large set of domain names that have been whitelisted by some reliable authority. Since these names are mostly assigned by humans, they are pronounceable, and ten…
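
To make the idea in the abstract concrete, the following is a minimal sketch of scoring domain names against a null model learned from benign names. It is not the paper's model: a character bigram model with add-one smoothing is swapped in as a simple stand-in, and the function names, the tiny training list, and the threshold are all illustrative assumptions.

```python
import math
from collections import Counter

# Assumption: labels use only these characters; START/END mark name boundaries.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"
START, END = "^", "$"


def train_bigram_model(names):
    """Count character bigrams over a list of (assumed benign) domain labels."""
    pair_counts = Counter()
    context_counts = Counter()
    for name in names:
        chars = START + name.lower() + END
        for a, b in zip(chars, chars[1:]):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts


def avg_log_likelihood(name, model, alpha=1.0):
    """Average per-transition log-probability of a name under the null model."""
    pair_counts, context_counts = model
    vocab = len(ALPHABET) + 1  # possible next symbols: alphabet plus END
    chars = START + name.lower() + END
    total = 0.0
    for a, b in zip(chars, chars[1:]):
        # Additive smoothing so unseen bigrams get a small nonzero probability.
        p = (pair_counts[(a, b)] + alpha) / (context_counts[a] + alpha * vocab)
        total += math.log(p)
    return total / (len(chars) - 1)


def is_anomalous(name, model, threshold):
    """Flag a name whose average log-likelihood falls below the threshold."""
    return avg_log_likelihood(name, model) < threshold


# Hypothetical toy whitelist; a real null model would need a large list.
whitelist = ["google", "facebook", "wikipedia", "amazon",
             "youtube", "twitter", "linkedin", "microsoft"]
model = train_bigram_model(whitelist)
for candidate in ["facetube", "xqzjkvwpq"]:
    print(candidate, avg_log_likelihood(candidate, model))
```

Because the model is fitted only to benign names, no labeled malicious examples are needed, and scoring a single name is one pass over its characters. In practice the threshold would be chosen from the score distribution of held-out whitelisted names.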

Cited by 23 publications (13 citation statements)
References 15 publications

“…Bilge et al (2014) proposed the EXPOSURE system by extracting 15 domain name features, used the J48 decision tree for classification. Raghuram et al (2014) built a detection model on normal domain names for rapid identification of abnormal domain names, (Grill et al 2015) studied a method only through NetFlow information over DNS traffic rather than domain names, (Wang and Shirley 2015) proposed using word segmentation to derive tokens from domain names to detect DGA domain names with features like the number of characters and digits. But in the actual network, features are difficult to extract and collect.…”
Section: DGA Detection Methods (mentioning)
confidence: 99%
“…Raghuram et al. [ 21 ] proposed a generative model by analyzing the probability distribution of characters, words, word lengths, and number of words in human generated domain names. These models require the manual construction of feature sets by users with rich feature-engineering experience.…”
Section: Related Work (mentioning)
confidence: 99%
“…In [12], a detection scheme based on the length distribution of DNS request domain name was proposed, which can be used for detecting unknown DGAs. In [13], a detection model on normal DNS domain names for recognizing abnormal domain names was established, it uses natural language processing (NLP) to analyze the character features. In [14], the method based on network flow information over DNS traffic rather than domain names was proposed, but it is limited by the difficulty of collecting the flow information in large-scale networks.…”
Section: Related Work (mentioning)
confidence: 99%
“…In actual fact, the methods in [12][13][14][15][16] are all limited by the status of the network environment and data integrity. In real networks, especially in large-scale networks, these traffic features are very difficult to collect.…”
Section: Related Work (mentioning)
confidence: 99%