2013
DOI: 10.5120/13145-0549

Designing Spam Model - Classification Analysis using Decision Trees

Abstract: Spam has diluted the message pool and causes frustration, so automatic processing of emails is required. This study constructs a spam model using a classification technique from data mining. To accomplish this, experiments were conducted on a spam dataset downloaded from the UCI Machine Learning Repository, which was classified using the popular data mining tool WEKA. The final classification result is "1" if a message is spam and "0" otherwise. Email is a popular mode of communication, and its…
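To make the setup concrete, here is a minimal sketch of the experiment in Python with scikit-learn rather than WEKA. C4.5/J48 is not available in scikit-learn, so a CART tree with entropy splits stands in for it, and the local filename spambase.data is an assumption; the UCI Spambase file has 57 numeric features per row with the class label (1 = spam, 0 = not spam) in the last column.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Assumed local copy of the UCI Spambase data: 57 features, label in last column.
data = np.loadtxt("spambase.data", delimiter=",")
X, y = data[:, :-1], data[:, -1].astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# CART with entropy splits as a rough analog of WEKA's J48 (C4.5).
clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)  # 1 = spam, 0 = legitimate mail
print("accuracy:", accuracy_score(y_test, pred))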

Cited by 7 publications (5 citation statements)
References 7 publications
“…We can see how the number of leaves and the tree size decrease as the confidence factor converges to a much smaller value, while the accuracy remains almost the same. However, lowering the confidence factor means we have less confidence in our training data (Rajput & Arora, 2013); therefore, the confidence factor was fixed at 0.001. In addition, as mentioned in Patel and Upadhyay (2012), increasing minNumObj decreases the size of the tree and the number of leaves dramatically with a very small compromise in accuracy, as can be seen in Table 3.…”
Section: Results
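To illustrate the trade-off this statement describes, here is a small hypothetical sweep. WEKA's confidenceFactor has no direct scikit-learn equivalent, so cost-complexity pruning (ccp_alpha) stands in for it, and a synthetic dataset replaces Spambase for brevity; the expected pattern is the same: stronger pruning shrinks the tree while accuracy stays nearly flat.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Spambase data (57 features, binary labels).
X, y = make_classification(n_samples=2000, n_features=57, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Larger ccp_alpha = heavier post-pruning = fewer leaves and nodes.
for alpha in [0.0, 0.0005, 0.001, 0.005, 0.01]:
    clf = DecisionTreeClassifier(criterion="entropy", ccp_alpha=alpha,
                                 random_state=0).fit(X_tr, y_tr)
    print(f"alpha={alpha:<7} leaves={clf.get_n_leaves():<4}"
          f" nodes={clf.tree_.node_count:<4} acc={clf.score(X_te, y_te):.3f}")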
“…The minimum number of objects prevents the creation of a new branch unless the number of instances in the branch is equal to or greater than the specified threshold; thus, this is a pre-pruning strategy (Drazin & Montag, 2012; Han et al., 2011; Rajput & Arora, 2013; Witten et al., 2016). Apart from the above three options, all the remaining options were left at their defaults.…”
Section: Developing the Decision Tree with J48
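A sketch of the pre-pruning effect described above, again on synthetic data: WEKA's minNumObj corresponds roughly to scikit-learn's min_samples_leaf, which refuses a split unless every resulting leaf would retain at least that many training instances.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=57, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Raising the per-leaf minimum prunes the tree before it is ever grown.
for m in [1, 5, 10, 25, 50]:
    clf = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=m,
                                 random_state=0).fit(X_tr, y_tr)
    print(f"minNumObj~{m:<3} leaves={clf.get_n_leaves():<4}"
          f" acc={clf.score(X_te, y_te):.3f}")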
“…Examples of text processing techniques are stopword removal and tokenization. Common classification techniques for document analysis include Support Vector Machines (Elmurngi and Gherbi, 2017), Naive Bayes (Zhang and Li, 2007), Logistic Regression (Cheng and Hüllermeier, 2009), and Decision Trees (Rajput and Arora, 2013).…”
Section: Related Work
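As a toy illustration of the pipeline this passage names, the sketch below runs tokenization and stopword removal through a CountVectorizer and fits one classifier from each of the four named families; the documents and labels are invented for demonstration.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

docs = ["win a free prize now", "meeting agenda for monday",
        "free offer claim your prize", "project status and agenda"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (toy data)

# Tokenization and English stopword removal happen inside the vectorizer.
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)

for model in (MultinomialNB(), LogisticRegression(),
              LinearSVC(), DecisionTreeClassifier()):
    model.fit(X, labels)
    print(type(model).__name__,
          model.predict(vec.transform(["free prize offer"])))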
“…Lisong Pei, Jakob Schütte, and Carlos Simon (2007-10-07) explained that there are two basic, complementary trends in intrusion detection, one of which is knowledge-based: in a knowledge-based approach, knowledge about known attacks is used to detect them [17].…”
Section: Related Research
“…Decision trees are built using learning samples, which are historical data with pre-assigned classes. The resulting decision tree is pruned by cost-complexity pruning, and its splits are selected using the Gini index [17]. Here, the data are split into two subsets in such a way that each subset contains more homogeneous records than its parent.…”
Section: Classification and Regression Tree
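A worked example of the Gini index mentioned above: a node's impurity is 1 minus the sum of squared class proportions, and a candidate binary split is scored by the size-weighted impurity of its two children; the class counts below are invented for illustration.

def gini(counts):
    # Gini impurity: 1 - sum of squared class proportions.
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

parent = [50, 50]               # 50 spam, 50 non-spam records at the node
left, right = [45, 5], [5, 45]  # candidate binary split

weighted = (sum(left) * gini(left) + sum(right) * gini(right)) / sum(parent)
print(f"parent Gini = {gini(parent):.3f}")  # 0.500
print(f"split Gini  = {weighted:.3f}")      # 0.180 -> children more homogeneous

Since the weighted child impurity (0.180) is well below the parent's (0.500), CART would favor this split; among all candidate splits it chooses the one with the lowest weighted child impurity.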