2008
DOI: 10.1016/j.eswa.2007.01.018

An incremental cluster-based approach to spam filtering

Cited by 43 publications (26 citation statements)
References 12 publications
“…The larger the value of z, the more significant the difference between the two means, and therefore the greater the discriminatory power of the attribute in classifying the dataset as spam/legitimate. Sorting the list of input attributes in descending order of their Z statistic gives the following ranking: {21, 25 Attributes toward the beginning of the list should prove more effective than attributes toward the end. However, this simple ranking approach does not take into account possible mutual correlation between attributes when they are grouped together to classify the dataset.…”
Section: Description of the Dataset (mentioning)
confidence: 99%
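
The ranking described in this statement can be sketched as follows. This is a minimal illustration, not the citing paper's code: the statement does not give its formula, so the sketch assumes the common two-sample form z = |mean_spam - mean_legit| / sqrt(var_spam/n_spam + var_legit/n_legit). The names `z_statistic` and `rank_attributes` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def z_statistic(spam_values, legit_values):
    """Two-sample z statistic for the difference between the class means.

    Assumed form (not taken from the cited text):
    |mean_spam - mean_legit| / sqrt(var_spam / n_spam + var_legit / n_legit)
    """
    n_spam, n_legit = len(spam_values), len(legit_values)
    diff = abs(mean(spam_values) - mean(legit_values))
    std_err = sqrt(variance(spam_values) / n_spam + variance(legit_values) / n_legit)
    if std_err == 0:
        # Both classes are constant: either identical (0) or perfectly separated.
        return 0.0 if diff == 0 else float("inf")
    return diff / std_err


def rank_attributes(spam_rows, legit_rows):
    """Return attribute indices sorted by descending z statistic."""
    n_attrs = len(spam_rows[0])
    scored = []
    for j in range(n_attrs):
        spam_col = [row[j] for row in spam_rows]
        legit_col = [row[j] for row in legit_rows]
        scored.append((z_statistic(spam_col, legit_col), j))
    # Highest z first; ties broken by attribute index for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [j for _, j in scored]


# Toy data: attribute 1 separates the classes, attribute 0 does not.
spam = [[1.0, 5.0], [1.2, 9.0], [0.8, 1.0]]
legit = [[1.0, 0.0], [1.1, 0.1], [0.9, -0.1]]
ranking = rank_attributes(spam, legit)
```

As the statement notes, each attribute is scored in isolation, so two highly correlated attributes can both rank near the top while adding little when used together.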
“…Among these methods, Bayesian classifiers have been widely applied as one of the most effective spam filters [19,23]. Recent reviews and taxonomy of current and potential solutions for the spam problem are presented in [9,18,21,25,28,44]. Although empirical studies on existing spam filters have reported promising results in terms of detection accuracy, the number of false positives (legitimate messages classified as spam) is still unacceptably high.…”
Section: Introduction (mentioning)
confidence: 99%
“…The application of clustering-analysis methods to the problem of filtering e-mails into legitimate and spam is considered in papers [15][16][17][18]. Since 2009, beginning with the article by Paulo Cortez et al. [19], one encounters the term Symbiotic Data Mining, a hybrid of Collaborative Filtering (CF) and Content-Based Filtering (CBF).…”
Section: Historical Review of Spam Filtering Methods (mentioning)
confidence: 99%
“…For example, concept drift in textual data streams can be identified by monitoring word frequencies (Swan and Allan 1999) and the formation of new word clusters (Hsiao and Chang 2008; Spinosa et al. 2007). Kifer et al. (2004) introduce a more generic approach that uses a two-window paradigm to detect changes in feature distribution.…”
Section: Triggered Rebuild (mentioning)
confidence: 99%
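
The two-window idea mentioned in this statement can be illustrated with a small sketch. It is not the algorithm of Kifer et al.; it only assumes a fixed reference window and a sliding current window of tokenized messages, compares their word-frequency distributions with total variation distance, and flags drift when the distance exceeds a threshold. The class name `TwoWindowDetector`, the distance measure, and the threshold are all illustrative assumptions.

```python
from collections import Counter, deque


def total_variation(counts_a, counts_b):
    """Total variation distance between two word-count distributions (0 to 1)."""
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    words = set(counts_a) | set(counts_b)
    return 0.5 * sum(abs(counts_a[w] / total_a - counts_b[w] / total_b) for w in words)


class TwoWindowDetector:
    """Hypothetical drift detector: fixed reference window vs. sliding current window."""

    def __init__(self, window_size, threshold):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = deque(maxlen=window_size)
        self.current = deque(maxlen=window_size)

    def update(self, tokens):
        """Add one tokenized message; return True if drift is flagged."""
        if len(self.reference) < self.window_size:
            self.reference.append(tokens)  # still filling the reference window
            return False
        self.current.append(tokens)  # oldest message drops out once full
        if len(self.current) < self.window_size:
            return False
        ref_counts = Counter(w for doc in self.reference for w in doc)
        cur_counts = Counter(w for doc in self.current for w in doc)
        if total_variation(ref_counts, cur_counts) > self.threshold:
            # Drift flagged: the current window becomes the new reference,
            # which is the point where a model rebuild would be triggered.
            self.reference = deque(self.current, maxlen=self.window_size)
            self.current.clear()
            return True
        return False


detector = TwoWindowDetector(window_size=2, threshold=0.5)
stream = [["a", "b"]] * 4 + [["x", "y"]] * 2
flags = [detector.update(doc) for doc in stream]
```

In this toy stream the vocabulary changes completely in the last two messages, so drift is flagged only once the current window contains nothing but the new words.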