2015 International Conference on High Performance Computing & Simulation (HPCS) 2015
DOI: 10.1109/hpcsim.2015.7237040

A lexical approach for classifying malicious URLs

Abstract: Given the continuous growth of illicit activities on the Internet, there is a need for intelligent systems to identify malicious web pages. It has been shown that URL analysis is an effective tool for detecting phishing, malware, and other attacks. Previous studies have performed URL classification using a combination of lexical features, network traffic, hosting information, and other strategies. These approaches require time-intensive lookups which introduce significant delay in real-time systems. This paper de…
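
To make the idea of lexical features concrete, here is a minimal sketch of features computed from the URL string alone, with no network or hosting lookups. The function name `lexical_features` and the particular feature set are assumptions chosen for illustration; the abstract does not list the features the paper actually uses.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute simple string-based features of a URL (illustrative set only)."""
    # urlparse only recognizes the host when a scheme is present,
    # so prepend a default scheme for bare inputs like "example.com/x".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    path = parsed.path or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(path),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        # Non-empty path segments between "/" separators.
        "num_path_tokens": len([t for t in path.split("/") if t]),
    }


if __name__ == "__main__":
    for name, value in lexical_features("http://example.com/a/b").items():
        print(f"{name}: {value}")
```

Because every value is derived from the string itself, such features can be computed without the lookups that the abstract identifies as a source of delay in real-time systems.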

Cited by 52 publications (45 citation statements)
References 18 publications
“…of benign to malicious URLs, whereas in [28], [45], [49], [67], [77], [80] researchers evaluated their classifiers on highly unbalanced datasets (the benign dataset size is almost 50 times or more) but fixed ratios. Papers [35], [47], [49], [55], [69] report moderate benign-to-malicious data ratios (e.g., 2:1, 3:2, 5:1, etc.). Authors in [30], [43], [46], [54], [81]- [84] used slightly PhishTank (pub.)…”
Section: ) Dataset Sources and Availability: In
Citation type: mentioning
confidence: 99%
“…c (pvt.) [21] [38], [72] [22], [41] [38], [41] [72], [77] a Gmail directory [34] b Spam URLs from UAB Spam DataMine email messages [38] c Private SpamTrap system [72], collected using blacklists and crowd-sourcing [41], [77] [78] a MalwareDomainslist [35] and malwareurl [48] b Malware private sources [78] unbalanced, or almost balanced, 9 data. Systems in [43], [59], [73] used more malicious URLs than legitimate ones.…”
Section: ) Dataset Sources and Availability: In
Citation type: mentioning
confidence: 99%