As a response to the increasing number of cyber threats, novel detection and prevention methods are constantly being developed. One of the main obstacles hindering the development and evaluation of such methods is the shortage of reference data sets. What is proposed in this work is a way of testing methods detecting network threats. It includes a procedure for creating realistic reference data sets describing network threats and the processing and use of these data sets in testing environments. The proposed approach is illustrated and validated on the basis of the problem of spam detection. Reference data sets for spam detection are developed, analysed and used to both generate the requested volume of simulated traffic and analyse it using machine learning algorithms. The tests take into
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.