“…Moreover, datasets should be designed and created such that they are applicable to both knowledge-based and behavior-based IIDS training (currently, no such dataset is known to us), e.g., by including repetitions and variations of the same attack, providing sufficiently long samples of benign behavior, and including novel attacks that were not previously trained on, to avoid drawing false conclusions [48]. Another crucial factor is high-quality labelling of the dataset, which can be difficult to perform correctly [86] but is of utmost importance for accurately calculating a given metric. Lastly, reviews of datasets' quality [49], e.g., with statistical means assessing the data distribution and stability [79] or analyses of how easily a dataset can be "solved" [86], can guide developers in choosing a relevant dataset.…”
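
One way to probe how easily a dataset can be "solved" is to check how far a trivial rule gets on it. The sketch below is an illustration only, not the method of [86]: it assumes numeric features and binary labels (0 = benign, 1 = attack), and the function name `best_single_feature_accuracy` is hypothetical. It searches for the single feature and threshold that best separate the two classes. An accuracy close to 1.0 would suggest that the dataset offers little challenge to an IIDS.

```python
from typing import Sequence, Tuple


def best_single_feature_accuracy(
    rows: Sequence[Sequence[float]], labels: Sequence[int]
) -> Tuple[int, float, float]:
    """Return (feature index, threshold, accuracy) of the best one-feature rule.

    Hypothetical helper for illustration; labels are assumed to be
    0 (benign) or 1 (attack).
    """
    n = len(labels)
    best = (-1, 0.0, 0.0)
    for j in range(len(rows[0])):
        # Try every observed value of feature j as a candidate threshold.
        for threshold in sorted({row[j] for row in rows}):
            # Rule: predict "attack" when feature j exceeds the threshold.
            hits = sum(
                (row[j] > threshold) == bool(y) for row, y in zip(rows, labels)
            )
            # The inverted rule (attack when at or below the threshold)
            # scores n - hits, so keep whichever direction is better.
            accuracy = max(hits, n - hits) / n
            if accuracy > best[2]:
                best = (j, threshold, accuracy)
    return best


if __name__ == "__main__":
    # Toy data (made up): feature 0 separates the classes, feature 1 does not.
    rows = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [0.8, 2.0]]
    labels = [0, 0, 1, 1]
    print(best_single_feature_accuracy(rows, labels))
```

A high score from such a baseline does not by itself make a dataset unusable, but it indicates that results reported on it should be compared against equally simple rules.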