2007
DOI: 10.4304/jsw.2.3.43-55

Spam Email Classification using an Adaptive Ontology

Abstract: Email has become one of the fastest and most economical forms of communication. However, the increase of email users has resulted in the dramatic increase of spam emails during the past few years. As spammers always try to find a way to evade existing filters, new filters need to be developed to catch spam. Ontologies allow for machine-understandable semantics of data. It is important to share information with each other for more effective spam filtering. Thus, it is necessary to build ontology and a …

Citations: cited by 20 publications (11 citation statements)
References: 44 publications (25 reference statements)

Citation statements
“…In its simplest form, text classification is binary in nature, such as the categorization of spam vs. non-spam email (Youn and McLeod, 2007). Researchers from distinct fields are exposed to a wide range of text classification problems.…”
Section: Introduction (mentioning)
confidence: 99%
“…• high volume: one has to deal with huge amount of existing training data, in million or even billion scale; • high velocity: new data often arrives sequentially and very rapidly, e.g., about 182.9 billion emails are sent/received worldwide every day according to an email statistic report by the Radicati Group [1]; • high dimensionality: there are a large number of features, e.g., for some spam email classification tasks, the length of the vocabulary list can go up from 10,000 to 50,000 or even to million scale; • high sparsity: many feature elements are zero, and the fraction of active features is often small, e.g., the spam email classification study in [2] showed that accuracy saturates with dozens of features out of tens of thousands of features. The above characteristics present huge challenges for big data stream classification tasks when using conventional data stream classification techniques that are often restricted to batch learning setting.…”
Section: Introduction (mentioning)
confidence: 99%
“…Among the supervised approaches, we can mention the use of rule-based systems [Cohen, 1996], Support Vector Machines [Drucker et al, 1999, Kiritchenko and Matwin, 2011, Yoo et al, 2011], Bayesian networks [Sahami et al, 1998, Androutsopoulos et al, 2000, Sakkis et al, 2001, Isozaki et al, 2005], memory-based reasoning [Segal and Kephart, 1999, Delany et al, 2005], decision trees [Youn and McLeod, 2007], linear logistic regression [Aberdeen et al, 2010], neural networks [Yu and Zhu, 2009] and semantic analysis methods [Park and An, 2010].…”
Section: Related Work (mentioning)
confidence: 99%