2020
DOI: 10.48550/arxiv.2012.09344
Preprint

Machine Learning for Detecting Data Exfiltration: A Review

Bushra Sabir,
Faheem Ullah,
M. Ali Babar
et al.

Abstract: Research at the intersection of cybersecurity, Machine Learning (ML), and Software Engineering (SE) has recently taken significant steps in proposing countermeasures for detecting sophisticated data exfiltration attacks. It is important to systematically review and synthesize the ML-based data exfiltration countermeasures for building a body of knowledge on this important topic. Objective: This paper aims at systematically reviewing ML-based data exfiltration countermeasures to identify and classify ML approac…

Citations: cited by 2 publications (7 citation statements)
References: 96 publications
“…Preprocessed data is passed to a feature engineering module where data is transformed into features (numeric form) (step 2, Figure 5), which are used as input for the ML algorithms. The reason behind this is that ML algorithms can only work with numerical data [2,39,40,50]. Depending on the type, size and diversity of CTI, the data science team chooses a feature engineering approach.…”
Section: Threat Data Prediction Model Building Layer
Citation type: mentioning
confidence: 99%
“…Depending on the nature of the training data, different models are built by a data science team. Traditionally a set of ML algorithms are applied to find an algorithm suitable for a specific dataset and user requirements [1,2,40]. To investigate the effectiveness of prediction models for the validation task, we considered a set of ML algorithms (e.g., Decision Tree, Naïve Bayes, K-Nearest Neighbours and Random Forest).…”
Section: Threat Data Prediction Model Building Layer
Citation type: mentioning
confidence: 99%
“…Hence, it is evident that researchers have had a narrow focus on the use of AI/ML-based techniques to enhance the security of C3I systems. By studying contemporary literature and their future directions, we want to draw researchers and practitioners attention to explore and incorporate AI/ML-enabled technologies such as the use of ML for detecting data exfiltration [81], AI for national security [82], and AI/ML-enabled cybersecurity [83], [84].…”
Section: Future Research Areas
Citation type: mentioning
confidence: 99%