2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06) 2006
DOI: 10.1109/wi.2006.170

The Role of URLs in Objectionable Web Content Categorization

Abstract: By analyzing a set of access attempts by teenagers to pornographic websites, we found that more than half of them are image searches and visits to websites with little text information. It is obvious that textual content-based filters cannot correctly categorize such access attempts. This paper describes a novel URL-based objectionable content categorization approach and its application to web filtering. In this approach, we break the URL into a sequence of n-grams with a range of n's and then a machine learnin…
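The n-gram step mentioned in the abstract can be illustrated with a minimal sketch. The abstract is truncated, so the details here are assumptions rather than the paper's method: the function name `url_ngrams`, the use of character-level n-grams, the default range of 3 to 5, and the choice to drop the scheme and lowercase the URL are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_ngrams(url, n_min=3, n_max=5):
    """Count character n-grams of a URL for every n in [n_min, n_max].

    Assumption: n-grams are taken over characters of host + path + query;
    the source abstract does not say which unit or range is used.
    """
    parts = urlsplit(url.lower())
    # Drop the scheme ("http", "https"), which carries little signal.
    text = parts.netloc + parts.path
    if parts.query:
        text += "?" + parts.query

    grams = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the string.
        for i in range(len(text) - n + 1):
            grams[text[i:i + n]] += 1
    return grams


if __name__ == "__main__":
    features = url_ngrams("http://example.com/images/search?q=cats")
    for gram, count in features.most_common(5):
        print(repr(gram), count)
```

The resulting counts could serve as a feature vector for whatever learning algorithm is applied afterwards; because the features come only from the URL string, they are available even for image searches and pages with little text.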

Cited by 24 publications (13 citation statements)
References 5 publications
“…They also used supervised learning algorithm to classify URL's. This method is useful only for text based web sites. [2] In this paper Ammar Almomani, B.B.Gupta, Samer Atawneh A Meulenberg, and Eman Almomani have done work to prevent phishing email attacks. Various techniques are used to detect such type of phishing mails.…”
Section: Research Work Conducted On Web Content Filtering (mentioning)
confidence: 99%
“…Second, the URL‐blocking approach rejects any requests to URLs on the blacklist while accepting requests to URLs that are not blacklisted (Lee & Luh, 2008; Zhang, Qin, & Yan, 2006). This approach has the advantage of consuming no computational resource for content analysis.…”
Section: Literature Review (mentioning)
confidence: 99%
“…URL analysis can be done either manually or automatically. There are a number of reports indicating positive results with automatic URL classification based on different technologies such as keyword search (Zhang, Qin, & Yan, 2006) or n-gram analysis (Kan & Thi, 2005). In our observations, we selected to manually classify resources because the study is a one-time procedure aimed at capturing students' browsing patterns from actual usage logs.…”
Section: Data Observations (mentioning)
confidence: 99%