2016
DOI: 10.5614/itbj.ict.res.appl.2016.10.1.3

Voting-based Classification for E-mail Spam Detection

Abstract: The problem of spam e-mail has gained a tremendous amount of attention. Although entities tend to use e-mail spam filter applications to filter out received spam e-mails, marketing companies still tend to send unsolicited e-mails in bulk, and users still receive a reasonable amount of spam e-mail despite those filtering applications. This work proposes a new method for classifying e-mails into spam and non-spam. First, several e-mail content features are extracted and then those features are used for cl…

Citations: cited by 16 publications (8 citation statements)
References: 22 publications
“…This step involved converting email messages into a format that could be processed by a machine learning algorithm. Email spam features are obtained by three different methods, namely the heuristic approach, term frequency (TF) analysis, and the behavior approach [27]. In the first approach, emails are mined to discover and generate patterns and rules, while in TF analysis, every word in an e-mail is specified as a feature.…”
Section: Feature Extraction (mentioning)
confidence: 99%
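
The TF analysis described in this statement, where every word in an e-mail becomes a feature, can be illustrated with a minimal sketch. The tokenizer, the function names `term_frequencies` and `to_vector`, and the two sample messages are assumptions made for illustration; they are not taken from the cited paper.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Map each lowercase word token in `text` to its relative frequency."""
    # Assumed tokenizer: runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def to_vector(tf, vocabulary):
    """Lay out one e-mail's term frequencies in a fixed vocabulary order."""
    return [tf.get(word, 0.0) for word in vocabulary]


# Hypothetical sample messages, not drawn from any data set.
emails = ["Win a free prize now, free entry", "Meeting moved to Monday"]
tfs = [term_frequencies(e) for e in emails]

# Every distinct word across the collection becomes one feature column.
vocabulary = sorted(set().union(*tfs))
vectors = [to_vector(tf, vocabulary) for tf in tfs]
```

Each row of `vectors` is then a fixed-length numeric representation of one message that a classifier could consume.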
“…This data set is freely available for research purposes (https://github.com/erayon/Email-spam-filter-naive-bayes-classifier-scikit-learntext-classification/tree/master/CSDMC2010_SPAM/CSDMC2010_SPAM, accessed 10 January 2019). This data set has been used in earlier research studies (Al-Shboul et al., 2016; Hijawi et al., 2017; Liu and Moh, 2016; Mercer, 2013, 2016). Characteristics of the data set are shown in Table 4.…”
Section: Data Set (mentioning)
confidence: 99%
“…Vote ensemble utilizes several combination algorithms to make its predictions. These combination rules include Average Probabilities, Minimum Probabilities, Maximum Probabilities, Product of Probabilities, and Majority Voting [24]. It creates a series of classifiers and then predicts based on either the mode or the mean of the base classifiers.…”
Section: Vote Ensemble (mentioning)
confidence: 99%
“…It creates a series of classifiers and then predicts based on either the mode or the mean of the base classifiers. Majority Voting has mostly been used for prediction, as the output of the ensemble is the label with the highest number of votes from the base classifiers [24]-[28]. It can also be weighted [30], that is, more weight is assigned to classifiers that are more likely to be correct [31].…”
Section: Vote Ensemble (mentioning)
confidence: 99%
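
The combination rules named in the Vote Ensemble statements can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the implementation used by any cited paper or library: the function names `combine` and `weighted_majority`, the label names, and the probability values are all hypothetical.

```python
from collections import Counter
from math import prod


def combine(probabilities, rule="average"):
    """Combine per-classifier class probabilities into one predicted label.

    `probabilities` is a list with one dict per base classifier,
    each mapping label -> predicted probability.
    """
    if rule == "majority":
        # Each classifier votes for its most probable label; the mode wins.
        votes = Counter(max(p, key=p.get) for p in probabilities)
        return votes.most_common(1)[0][0]

    reducers = {
        "average": lambda xs: sum(xs) / len(xs),
        "minimum": min,
        "maximum": max,
        "product": prod,
    }
    reducer = reducers[rule]
    labels = probabilities[0].keys()
    scores = {
        label: reducer([p[label] for p in probabilities]) for label in labels
    }
    return max(scores, key=scores.get)


def weighted_majority(probabilities, weights):
    """Majority voting where each classifier's vote counts `weight` times."""
    totals = Counter()
    for p, w in zip(probabilities, weights):
        totals[max(p, key=p.get)] += w
    return totals.most_common(1)[0][0]


# Hypothetical outputs of three base classifiers for a single e-mail.
outputs = [
    {"spam": 0.90, "ham": 0.10},
    {"spam": 0.40, "ham": 0.60},
    {"spam": 0.45, "ham": 0.55},
]

by_average = combine(outputs, "average")
by_majority = combine(outputs, "majority")
by_weighted = weighted_majority(outputs, weights=[3, 1, 1])
```

With these made-up numbers the rules disagree: averaging follows the one confident classifier, plain majority voting follows the two weaker ones, and the weighted vote flips back once the first classifier is trusted more. That sensitivity is why the choice of rule, and of weights, matters.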