2006
DOI: 10.1007/11752912_12

Spam Detection Using Character N-Grams

Cited by 15 publications (13 citation statements)
References 5 publications
“…As seen in Tables , the obtained accuracies are in an acceptable range. The obtained accuracies are acceptable values based on the results reported in the literature . The distinguishable powers of three data sets are strongly related to each of the P L and P R parameters, which are important parameters for detection of micro–macro patterns in email texts.…”
Section: Results (supporting)
confidence: 64%
“…As seen in Table , optimum P L and P R values were ( P L = 1, P R = 7), ( P L = 6, P R = 2) and ( P L = 3, P R = 5) for the TREC, the Spamassasian, and the Ling‐Spam corpus data sets, respectively, while P = 8 . The obtained spam recall (95.2%, 93.2%, and 92.4% for three corpuses), which is an indication of filter effectiveness , and spam precision (95.2%, 93.3%, and 92.5%), which is an indication of filter safety , rates are also acceptable. As a summary, the higher the recall, the fewer spam messages pass, and the higher the precision, the less non‐spam messages blocked.…”
Section: Results (mentioning)
confidence: 81%
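
The relationship between recall, precision, and the two kinds of filter error described in the statement above can be written out as a minimal sketch. The function names and the message counts here are hypothetical and chosen only for illustration; they are not taken from the cited paper or its data sets.

```python
def spam_recall(spam_caught: int, spam_missed: int) -> float:
    """Fraction of all spam messages that the filter caught (effectiveness)."""
    return spam_caught / (spam_caught + spam_missed)


def spam_precision(spam_caught: int, ham_blocked: int) -> float:
    """Fraction of blocked messages that really were spam (safety)."""
    return spam_caught / (spam_caught + ham_blocked)


# Hypothetical counts: 1000 spam messages, of which 952 are caught,
# plus 48 legitimate (non-spam) messages wrongly blocked.
recall = spam_recall(spam_caught=952, spam_missed=48)
precision = spam_precision(spam_caught=952, ham_blocked=48)
print(f"recall={recall:.3f} precision={precision:.3f}")
```

Raising recall means shrinking `spam_missed` (less spam gets through), while raising precision means shrinking `ham_blocked` (fewer legitimate messages are blocked).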
“…This may due to the insufficient number of recognizable features used in classification. In keyword-based methods, thousands of keywords are usually needed to gain satisfactory performance [22,23,30]; while in the proposed approach, only 26 features are used. It may be deduced that using behavior-based features can sufficiently distinguish spam and ham than keyword-based features do.…”
Section: Experiments III (mentioning)
confidence: 99%
“…Character N-grams as features have been investigated for spam messages in [16], where Support Vector Machine was used as the only classifier. The proposed method gave a precision of 99.60% with a recall of around 98%.…”
Section: Related Research (mentioning)
confidence: 99%
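
For readers unfamiliar with the feature type named in the paper's title, a character n-gram is any run of n consecutive characters in a message. The sketch below shows one simple way to extract and count them. The function names, the choice of n = 3, and the sample text are assumptions for illustration. The statement above does not specify the paper's n-gram length or preprocessing, and the Support Vector Machine classification step is not shown.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Return every overlapping run of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(text: str, n: int = 3) -> Counter:
    """Count how often each character n-gram occurs; usable as a feature vector."""
    return Counter(char_ngrams(text, n))


# Hypothetical sample message; spaces and punctuation are kept as characters.
features = ngram_counts("free free", n=3)
print(features.most_common(3))
```

Because the n-grams are taken over raw characters, no word tokenizer is needed, and deliberately misspelled or obfuscated words still share most of their n-grams with the original spelling.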