2015
DOI: 10.4236/ijcns.2015.85014

Improving Knowledge Based Spam Detection Methods: The Effect of Malicious Related Features in Imbalance Data Distribution

Abstract: Spam is no longer just commercial unsolicited email messages that waste our time, it consumes network traffic and mail servers' storage. Furthermore, spam has become a major component of several attack vectors including attacks such as phishing, cross-site scripting, cross-site request forgery and malware infection. Statistics show that the amount of spam containing malicious contents increased compared to the one advertising legitimate products and services. In this paper, the issue of spam detection is inves…

Cited by 30 publications (21 citation statements)
References: 11 publications
“…Yang and Jungchen [9] presented "Uninterrupted Approaches for Spam Detection Based on SVM and AIS" which in eight methods have been studied and compared due to their speed and accuracy in spam detection. Alqatawna et al [10] presented an incremental learning algorithm based on group learning. Then a software of proposed algorithm for spam filtering has been discussed.…”
Section: History Of Research (mentioning)
confidence: 99%
“…For example in Al-Jarrah, et al [6], the authors proposed a new set of features extracted from the headers of e-emails, which were then used for training common classifiers. In Alqatawna, et al [7], the authors focused on extracting malicious-related features and studied the effect of these features on the effectiveness of different classifiers. In Ruan and Tan [8], the authors proposed various approaches for constructing features for e-mail spam filtering (i.e.…”
Section: Spam Features (mentioning)
confidence: 99%
“…is, am, are, etc.)
X4: Average word length
X5: Minimum word length
X6: Maximum word length
X7: Count of uppercase letters
X8: Count of lowercase letters
X9: Count of special characters
X10: Longest sequence of adjacent capital letters
X11: Count of spam words [footnote 1]
X12: Count of slang words [footnote 2]
X13: Count of semicolons
X14: Count of sentences (split by full stops)
X15: Count of alpha-numeric words
X16: Time units
X17: Links (i.e. number of tokens ending with {.net, .com, .jo, etc.}.…”
Section: Readability Features (mentioning)
confidence: 99%
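
To make the readability features quoted above more concrete, the following is a minimal Python sketch of how a few of them might be computed from a message body. It is not the citing authors' implementation: the function name `readability_features`, the dictionary keys, the use of ASCII punctuation as "special characters", the reading of "alpha-numeric words" as words containing both a letter and a digit, and the three-entry suffix list are all assumptions made for illustration. Features that depend on word lists not given in the excerpt (spam words, slang words, time units) are left out.

```python
import re
import string

# Assumed suffix list: the quoted text names only .net, .com, .jo and "etc."
LINK_SUFFIXES = (".net", ".com", ".jo")


def readability_features(text: str) -> dict:
    """Compute a subset of the quoted features (hypothetical helper)."""
    tokens = text.split()
    # Strip surrounding punctuation so "NOW;" is measured as a 3-letter word.
    words = [t.strip(string.punctuation) for t in tokens]
    words = [w for w in words if w]
    lengths = [len(w) for w in words]
    caps_runs = re.findall(r"[A-Z]+", text)

    return {
        "avg_word_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "min_word_length": min(lengths, default=0),
        "max_word_length": max(lengths, default=0),
        "uppercase_count": sum(c.isupper() for c in text),
        "lowercase_count": sum(c.islower() for c in text),
        # Assumption: "special characters" means ASCII punctuation.
        "special_char_count": sum(c in string.punctuation for c in text),
        "longest_caps_run": max((len(r) for r in caps_runs), default=0),
        "semicolon_count": text.count(";"),
        # Naive split on full stops, as the quoted description states.
        "sentence_count": len([s for s in text.split(".") if s.strip()]),
        # Assumption: a word counts if it has at least one letter and one digit.
        "alphanumeric_word_count": sum(
            1
            for w in words
            if any(c.isalpha() for c in w) and any(c.isdigit() for c in w)
        ),
        "link_count": sum(
            1
            for t in tokens
            if t.lower().rstrip(string.punctuation).endswith(LINK_SUFFIXES)
        ),
    }


if __name__ == "__main__":
    sample = "WIN cash NOW; visit win4u.com today."
    for name, value in readability_features(sample).items():
        print(f"{name}: {value}")
```

Note that splitting sentences on full stops also splits on the dot inside a domain name, so a message containing links will report more "sentences" than a reader would count; the sketch keeps that behaviour because it follows the quoted definition literally.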
“…Wang et al [9] used four different types of feature sets (user features, content features, Uni-Bi features, and sentiment features) and their combinations to validate Random Forests for Twitter spam detection, and the experimental results show feature combination outperformed a single type of features on decision making. The study of Alqatawna et al [16] showed that adding malicious related features to training data significantly improved the detection of spam emails. Recently, He et al [17] investigated spam detection through analysing the features in email and message spams with a RBF-kernal SVM spam detector, thus to provide clues of spams to users.…”
Section: Introduction (mentioning)
confidence: 99%