Phishing is a security threat with serious effects on individuals as well as on the targeted brands. Although this threat has been around for quite a long time, it is still very active and successful. In fact, the tactics used by attackers have been evolving continuously in the years to make the attacks more convincing and effective. In this context, phishing detection is of primary importance. The literature offers many diverse solutions that cope with this issue and in particular with the detection of phishing websites. This paper provides a broad and comprehensive review of the state of the art in this field by discussing the main challenges and findings. More specifically, the discussion is centered around three important categories of detection approaches, namely, list-based, similarity-based and machine learning-based. For each category we describe the detection methods proposed in the literature together with the datasets considered for their assessment and we discuss some research gaps that need to be filled.
Phishing is a very dangerous security threat that affects individuals as well as companies and organizations. To fight the risks associated with this threat, it is important to detect phishing websites in a timely manner. Machine learning models work well for this purpose as they can predict phishing cases, using information on the underlying websites. In this paper, we contribute to the research on the detection of phishing websites by proposing an explainable machine learning model that can provide not only accurate predictions of phishing, but also explanations of which features are most likely associated with phishing websites. To this aim, we propose a novel feature selection model based on Lorenz Zonoids, the multidimensional extension of Gini coefficient. We illustrate our proposal on a real dataset that contains features of both phishing and legitimate websites.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.