Many DNS firewall solutions now exist to prevent users from accessing malicious domains. They can provide real-time protection and block illegitimate communications, strengthening an organization's cybersecurity posture. Most of these solutions rely on lists of known malicious domains that are constantly updated. This approach, however, can only block communications with domains already known to be malicious, and misses domains that are malicious but have not yet been added to the blocklists. This work studies the implementation of a DNS firewall solution based on machine learning (ML), with the aim of improving the on-the-fly detection of malicious domain requests. For this purpose, a dataset with 34 features and 90 k records was created from real DNS logs and enriched using open-source intelligence (OSINT) sources. After exploratory analysis and data preparation, the final dataset was submitted to different supervised ML algorithms to classify, accurately and quickly, whether a domain request is malicious. The results show that the ML algorithms classified benign and malicious domains with accuracy rates between 89% and 96% and classification times between 0.01 and 3.37 s. The contributions of this study are twofold. In terms of research, the dataset was made public and the methodology can be reused by other researchers. In terms of solution, the work provides a baseline for implementing an in-band DNS firewall.
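As a rough illustration of the idea, the sketch below combines a blocklist lookup with a feature-based check for domains the blocklist does not know. It is a minimal sketch under stated assumptions, not the method used in this work: the abstract does not list the 34 features, so the lexical features here (length, label count, digit ratio, character entropy, hyphen presence) are illustrative choices, and `extract_features`, `is_blocked`, and the `model` callable are hypothetical names standing in for a trained supervised classifier.

```python
import math
from collections import Counter
from typing import Callable, Dict, Set


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def normalize(domain: str) -> str:
    """Lower-case the name and drop surrounding whitespace and a trailing dot."""
    return domain.strip().rstrip(".").lower()


def extract_features(domain: str) -> Dict[str, float]:
    """Illustrative lexical features only (assumption: not the paper's 34)."""
    name = normalize(domain)
    digits = sum(ch.isdigit() for ch in name)
    return {
        "length": len(name),
        "num_labels": len(name.split(".")) if name else 0,
        "digit_ratio": digits / len(name) if name else 0.0,
        "entropy": shannon_entropy(name),
        "has_hyphen": float("-" in name),
    }


def is_blocked(
    domain: str,
    blocklist: Set[str],
    model: Callable[[Dict[str, float]], bool],
) -> bool:
    """Block known-bad domains first, then defer to the classifier."""
    name = normalize(domain)
    if name in blocklist:
        return True
    # `model` is a placeholder for a trained classifier's predict step.
    return bool(model(extract_features(name)))
```

Checking the blocklist first keeps the common case cheap, and the classifier only runs for domains that are not already listed, which is where a list-based firewall has no answer. In a real deployment, `model` would wrap a classifier trained on labeled DNS data, and its per-request latency would need to be measured against the in-band time budget.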