2010
DOI: 10.1145/1754393.1754394

Detecting visually similar Web pages

Abstract: We propose a novel approach for detecting visual similarity between two Web pages. The proposed approach applies Gestalt theory and considers a Web page as a single indivisible entity. The concept of supersignals, as a realization of Gestalt principles, supports our contention that Web pages must be treated as indivisible entities. We objectify, and directly compare, these indivisible supersignals using algorithmic complexity theory. We illustrate our approach by applying it to the problem of detecting phishin…

Cited by 83 publications (6 citation statements)
References 44 publications
“…For the evaluation of the system we use a set of at least 1,000 phishing websites and a second set of legitimate website that have been impersonated by those websites mixed with the 1,000 most visited websites worldwide 4 . In a first step we will define the best detection thresholds using a subsample of the given websites.…”
Section: Discussion (mentioning)
confidence: 99%
“…They evaluated the detector against a set of 140 phishing websites and 27 real websites performing very well. Chen et al [4] use the rendered web page as input to a normalized compression distance compressor. With a test set of 320 phishing websites that target 16 different banking websites they showed that phishing websites are rated significantly closer to their originals than banking pages among themselves.…”
Section: Related Work (mentioning)
confidence: 99%
“…[2][3][4][5][6][7] Other solutions consider the textual and visual similarity of suspicious web pages and legitimate pages identified as potential targets of phishing attacks. [8][9][10][11][12][13] Many other solutions are based on machine learning models. [14][15][16][17][18][19][20][21][22][23] The main contributions offered by the literature are analyzed and discussed in a recent survey by Zieni et al 1 In the context of machine learning, the survey highlights that phishing detection approaches mainly differ for the features chosen to describe the properties of the websites and for the learning algorithms applied for the classification of the websites.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some solutions focus on the creation and maintenance of lists of phishing and legitimate websites, that is, blacklists and whitelists 2–7 . Other solutions consider the textual and visual similarity of suspicious web pages and legitimate pages identified as potential targets of phishing attacks 8–13 . Many other solutions are based on machine learning models 14–23 …”
Section: Related Work (mentioning)
confidence: 99%
“…They test their algorithm against a set of 140 phishing websites and 27 real websites performing very well. Chen et al 3 used the rendered web page as input to a normalized compression distance compressor. They test that on a set of 320 phishing websites that target 16 different real banking sites, their work shows that phishing websites rated significantly closer to their originals.…”
Section: Related Work (mentioning)
confidence: 99%
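
Two of the citation statements above describe the approach as feeding the rendered web page to a normalized compression distance (NCD) compressor. As a rough illustration of how such a distance behaves, here is a minimal sketch. It rests on assumptions not found in the source: `zlib` stands in for whatever compressor the authors used, the inputs are arbitrary byte strings rather than rendered page images, and the names `compressed_size` and `ncd` are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data; a stand-in for algorithmic complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def pseudo_page(seed: int, size: int = 2000) -> bytes:
    """Deterministic pseudo-random bytes acting as a placeholder for a rendered page."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


# Placeholder data: an "original" page, a near-copy with a few changed bytes,
# and an unrelated page. Real input would be the rendered page images.
original = pseudo_page(0)
near_copy = bytearray(original)
for position in (100, 700, 1500):
    near_copy[position] ^= 0xFF
near_copy = bytes(near_copy)
unrelated = pseudo_page(1)

print("original vs near copy:", round(ncd(original, near_copy), 3))
print("original vs unrelated:", round(ncd(original, unrelated), 3))
```

A lower value means the two inputs share more compressible structure. In this sketch the near copy should score well below the unrelated page, which mirrors the reported finding that phishing pages are rated closer to their originals. Any detection threshold would have to be chosen empirically.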