2019
DOI: 10.3844/jcssp.2019.384.394

Comparative Study of Data Mining Classification Techniques for Detection and Prediction of Phishing Websites

Abstract: Data mining is the process of discovering or extracting information from large amounts of data stored in databases or datasets, such as a phishing dataset. Phishing is a major web security problem in which attackers imitate legitimate websites to mislead online users and steal their sensitive information. This paper aims to detect and predict whether a website belongs to the legitimate or the phishing class. It investigates different data mining classifiers that are applied to the phishing datase…

Citations: cited by 2 publications (2 citation statements)
References: 31 publications
“…We used three machine learning algorithms in the classification process in this study, namely the K-NN, SVM, and Naive Bayes algorithms. In the K-NN algorithm, the K value (number of nearest neighbors) plays an essential role in the performance of the classification results; since it determines how much data should have similar characteristics [9]. As shown in the optimization of the K-NN algorithm using the certainty factor in determining students' careers, with 12 K values (K = 1 to K = 12), this study shows that the highest performance came from the K values of 3 and 4, while the worst from the K values of 12 [10].…”
Section: I (mentioning)
confidence: 99%
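
The effect of K described in the statement above can be illustrated with a minimal sketch. It assumes a plain majority-vote K-NN with Euclidean distance; it is not the certainty-factor variant used in the cited study. The toy points, the labels, and the function names `knn_predict` and `accuracy_for_k` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy_for_k(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Hypothetical two-feature toy data, not taken from any cited dataset.
train = [
    ((0.0, 0.0), "legit"), ((0.1, 0.2), "legit"), ((0.2, 0.1), "legit"),
    ((1.0, 1.0), "phish"), ((0.9, 1.1), "phish"), ((1.1, 0.9), "phish"),
]
test = [((0.05, 0.1), "legit"), ((1.0, 0.95), "phish")]

# Sweep K the same way the cited study sweeps K = 1 to K = 12, on a smaller range.
scores = {k: accuracy_for_k(train, test, k) for k in range(1, 6)}
for k, acc in scores.items():
    print(f"K={k}: accuracy={acc:.2f}")
```

On real data the accuracy usually varies with K, which is why a sweep like this is a common way to choose it.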
“…Accuracy could not be sufficient measurement to assess the classification results. Al-Shalabi [38] explained the importance of other measures for the classification quality including recall and precision.…”
Section: Evaluation Measures (mentioning)
confidence: 99%
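
The point about accuracy being insufficient on its own can be shown with a small worked example. This is a sketch under assumptions: the labels and predictions are made up, the class names follow the abstract's legitimate/phishing split, and `precision_recall` is a hypothetical helper rather than code from any cited paper.

```python
def precision_recall(y_true, y_pred, positive="phishing"):
    """Compute precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when the class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical imbalanced sample: 9 legitimate sites and 1 phishing site.
y_true = ["legitimate"] * 9 + ["phishing"]
# A trivial classifier that labels every site as legitimate.
y_pred = ["legitimate"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Here the classifier reaches 90% accuracy while detecting no phishing site at all, so its precision and recall for the phishing class are both zero.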