Deep learning models for phishing URL classification that use character- and word-level URL features achieve the best accuracy. Various improvements have been proposed through deep learning parameters, including the network structure and the learning strategy. However, existing deep learning approaches show degraded recall because of the nature of phishing attacks, in which a URL is discarded immediately after being reported. An additional optimization process that minimizes false negatives by selecting the core features of phishing URLs is a promising avenue of improvement. To search for the optimal URL feature set and to fully exploit it, we propose a combined searching and learning strategy that effectively models the URL classifier for recall. By combining the deep-learning-based URL classifier with a genetic algorithm that searches for the feature set minimizing false negatives, we obtained an optimized classifier with the best performance. Extensive experiments on three real-world datasets consisting of 222,541 URLs showed that the proposed method achieved the highest recall among the deep learning models compared. We demonstrated the superiority of the method by 10-fold cross-validation and confirmed that recall improved compared to the latest deep learning method. In particular, accuracy and recall improved by 4.13%p and 7.07%p, respectively, compared to the convolutional–recurrent neural network in which the feature selection optimization was omitted.
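The search-and-learn idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a nearest-centroid classifier stands in for the deep URL classifier, the data are synthetic, and every name (`genetic_search`, `fitness`, `make_data`) and parameter (population size, mutation rate) is an assumption chosen for illustration. The genetic algorithm evolves binary feature masks, and the fitness of a mask is the validation recall of the classifier restricted to the selected features.

```python
import random

def recall(y_true, y_pred):
    """Recall = TP / (TP + FN); higher recall means fewer false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

def centroid_predict(train_X, train_y, test_X, mask):
    """Stand-in classifier (assumption): nearest class centroid on selected features."""
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return [0] * len(test_X)  # no features selected: predict benign
    cents = {}
    for label in (0, 1):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        cents[label] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    preds = []
    for x in test_X:
        d = {c: sum((x[i] - m) ** 2 for i, m in zip(idx, cents[c])) for c in cents}
        preds.append(1 if d[1] < d[0] else 0)
    return preds

def fitness(mask, data):
    """Fitness of a feature mask: validation recall of the stand-in classifier."""
    train_X, train_y, val_X, val_y = data
    return recall(val_y, centroid_predict(train_X, train_y, val_X, mask))

def genetic_search(data, n_features, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Evolve binary feature masks with truncation selection, one-point
    crossover, bit-flip mutation, and elitism."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, data), reverse=True)
        parents = scored[: pop_size // 2]
        children = [scored[0][:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = children
    best = max(pop, key=lambda m: fitness(m, data))
    return best, fitness(best, data)

def make_data(n, n_features, rng):
    """Synthetic data (assumption): only features 0 and 1 carry class signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(42)
train_X, train_y = make_data(200, 8, rng)
val_X, val_y = make_data(100, 8, rng)
data = (train_X, train_y, val_X, val_y)
best_mask, best_recall = genetic_search(data, n_features=8)
print("selected features:", best_mask, "validation recall:", round(best_recall, 3))
```

Optimizing recall alone can favor classifiers that over-predict the phishing class, so a practical fitness function would likely balance recall against accuracy or precision; how the original work weighs these is not stated in the abstract.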
Web 2.0 could significantly influence teaching and learning systems because it allows information on the internet to be used in various ways, created, and reorganized through information sharing. In this new environment of information-oriented, computer-based classes, a positive educational approach is required to develop new teaching and learning methods based on Web 2.0 in order to satisfy learners' intellectual curiosity and to lead future-oriented classes. This paper proposes a teaching-learning model for Web 2.0-based internet information education and analyzes its effect.
Deep learning-based URL classification using massive observations has been verified, especially in the field of phishing attack detection. Various improvements have been achieved by modeling the character and word sequences of URLs with convolutional and recurrent neural networks, and an ensemble of these models has been shown to perform best. However, existing ensemble methods are limited in how effectively they fuse the nonlinear correlation between heterogeneous features extracted from characters and from the sequence of sub-domains. In this paper, we propose a convolutional network-based ensemble learning approach that systematically fuses syntactic and semantic features for phishing URL detection. By learning the weights that integrate the heterogeneous features extracted from the URL, we obtained an ensemble rule with the best performance. A total of 45,000 benign URLs and 15,000 phishing URLs were collected, and 10-fold cross-validation was conducted for quantitative validation. The obtained classification accuracy of 0.9804 indicates that the proposed method outperforms existing machine learning algorithms and provides a plausible solution for phishing URL detection. We demonstrated the superiority of the proposed method by receiver operating characteristic (ROC) curve analysis and case analysis, and confirmed that accuracy improved by 1.93% compared to the latest deep model.
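As a rough illustration of the two input views mentioned in this abstract, the sketch below turns one URL into a padded character-ID sequence and a word sequence that starts with the sub-domain labels. The alphabet, the ID scheme, the maximum length of 64, and the function names `char_ids` and `word_tokens` are assumptions for illustration, not the preprocessing used in the paper.

```python
import re
import string
from urllib.parse import urlsplit

# Assumed character vocabulary: 0 is reserved for padding, 1 for unknown characters.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}

def char_ids(url, max_len=64):
    """Character-level view: lowercase, truncate, map to IDs, right-pad with 0."""
    ids = [CHAR_TO_ID.get(ch, 1) for ch in url.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))

def word_tokens(url):
    """Word-level view: sub-domain labels first, then path and query words."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    subdomains = host.split(".") if host else []
    rest = re.split(r"[^A-Za-z0-9]+", parts.path + "?" + parts.query)
    return subdomains + [w.lower() for w in rest if w]

url = "http://login.example.com/secure/update?id=42"
print(char_ids(url)[:10])
print(word_tokens(url))
```

In a setup like the one described, the character IDs would typically feed a convolutional branch and the word tokens a sequence branch, with a learned layer fusing the two; the abstract does not specify those details.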