Phishing Site Detection Analysis Using Artificial Neural Network

Pratiwi, Mellisa; Lorosae, Teguh Ansyor; Wibowo, Ferry Wahyu

doi:10.1088/1742-6596/1140/1/012048

Cited by 8 publications

(3 citation statements)

References 4 publications

“…However, this solution only achieved 92.5% detection accuracy during the experiment. In a separate study, Pratiwi et al [48] proposed a neural network approach using the same features available in the literature [39]. Even though it used the same features, only 83.4% detection accuracy was achieved during this study.…”

Section: Machine Learning-based Phishing Detection (mentioning)

confidence: 96%

Combining Long-Term Recurrent Convolutional and Graph Convolutional Networks to Detect Phishing Sites Using URL and HTML

Ariyadasa, Fernando. 2022. IEEE Access

Phishing, a well-known cyber-attack practice, has gained significant research attention in the cyber-security domain over the last two decades due to its dynamic attacking strategies. Although different solutions have been exercised against phishing, phishing attacks have increased dramatically in the past few years. Recent studies have shown that machine learning has become prominent in the present anti-phishing context, and techniques like deep learning have extensively improved anti-phishing tools' detection ability. This paper proposes PhishDet, a new way of detecting phishing websites through a Long-term Recurrent Convolutional Network and a Graph Convolutional Network using URL and HTML features. PhishDet is the first of its kind to use the powerful analysis and processing capabilities of Graph Neural Networks in the anti-phishing domain, and it recorded 96.42% detection accuracy with a 0.036 false-negative rate. It is effective against zero-day attacks, and its average detection time of 1.8 seconds could also be considered realistic. The feature selection of PhishDet is automatic and occurs inside the system, as PhishDet gradually learns URL and HTML content features to handle constantly changing phishing attacks. PhishDet has outperformed similar solutions by achieving a 99.53% f1-score on a public benchmark dataset. However, PhishDet requires periodic retraining to maintain its performance over time. If such retraining could be facilitated, PhishDet could fight against phishers for a more extended period to safeguard Internet users from this Internet threat.

“…The HTML content has provided some essential features when detecting phishing attacks [7], [9], [12], [13], [39], [47], [48]. Therefore, HTML content analysis is vital to detect phishing attacks, and HTMLDet is the responsible component for analysing HTML content in PhishDet.…”

Section: B. HTMLDet (mentioning)

confidence: 99%

Combining Long-Term Recurrent Convolutional and Graph Convolutional Networks to Detect Phishing Sites Using URL and HTML

Ariyadasa, Fernando. 2022. IEEE Access

“…18 presents the summary plots for class 0 (normal), class 1 (suspicious) and class 2 (phishing) for the UCI small data set. applications, as data phishing sites commonly have indicators as URL length, request a URL, the URL of anchor, SFH, submitting to email, SSL final state and abnormal URL, as observed by [141].…”

Section: Findings Validation and Best Models Interpretation (mentioning)

confidence: 99%

Addressing feature selection and extreme learning machine tuning by diversity-oriented social network search: an application for phishing websites detection

Bačanin, Živković, Antonijevic, et al. 2023. Complex Intell. Syst.

Feature selection and hyper-parameter optimization (tuning) are two of the most important and challenging tasks in machine learning. To achieve satisfactory performance, every machine learning model has to be adjusted for a specific problem, as an efficient universal approach does not exist. In addition, most data sets contain irrelevant and redundant features that can even have a negative influence on the model’s performance. Machine learning can be applied almost everywhere; however, due to the high risks involved with the growing number of malicious phishing websites on the World Wide Web, feature selection and tuning are addressed in this research for this particular problem. Although many metaheuristics have been devised for both feature selection and machine learning tuning challenges, there is still much room for improvement. Therefore, the research presented in this manuscript tries to improve phishing website detection by tuning an extreme learning machine model that utilizes the most relevant subset of features from phishing website data sets. To accomplish this goal, a novel diversity-oriented social network search algorithm has been developed and incorporated into a two-level cooperative framework. The proposed algorithm has been compared to six other cutting-edge metaheuristic algorithms that were also implemented in the framework and tested under the same experimental conditions. All metaheuristics have been employed in level 1 of the devised framework to perform the feature selection task. The best-obtained subset of features has then been used as the input to level 2 of the framework, where all algorithms perform tuning of the extreme learning machine. Tuning refers to the number of neurons in the hidden layers and the initialization of weights and biases.
For evaluation purposes, three phishing website data sets of different sizes and numbers of classes, retrieved from the UCI and Kaggle repositories, were employed. All methods are compared in terms of classification error, separately for levels 1 and 2 over several independent runs, and in terms of detailed metrics of the final outcomes (the output of level 2), including precision, recall, f1 score, and the areas under the receiver operating characteristic and precision–recall curves. Furthermore, an additional experiment is conducted in which only level 2 of the proposed framework is used, to establish metaheuristic performance for extreme learning machine tuning with all features, which represents a large-scale NP-hard global optimization challenge. According to the results of statistical tests, the research findings suggest that the proposed diversity-oriented social network search metaheuristic on average performs better than its competitors for both challenges and all data sets. Finally, SHapley Additive exPlanations analysis of the best-performing model was applied to determine the most influential features.
