2020
DOI: 10.1049/sfw2.12006
Empirical studies on the impact of filter‐based ranking feature selection on security vulnerability prediction

Abstract: Security vulnerability prediction (SVP) constructs models that identify potentially vulnerable program modules via machine learning. Previous studies measure the extracted modules with two kinds of features, taken from different points of view: one treats traditional software metrics as features, while the other uses text mining to extract term vectors as features. As a result, gathered SVP data sets often have numerous features and suffer from the curse of dimensionality. In this article, we mainly…
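The abstract's premise — scoring each feature independently, ranking by score, and keeping only the top-scoring features before training a prediction model — can be sketched as follows. This is a generic illustration of filter-based ranking feature selection, not the paper's exact setup: the synthetic data, the chi-squared scorer, the k=10 cutoff, and the logistic-regression model are all assumptions chosen for demonstration, assuming scikit-learn is available.

```python
# Sketch of filter-based ranking feature selection (FRFS):
# each feature is scored on its own (here: chi-squared statistic),
# features are ranked by score, and only the top-k survive before
# a vulnerability prediction model is trained.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an SVP data set: rows are program modules,
# columns are features (software metrics or text-mining term counts).
X, y = make_classification(n_samples=300, n_features=100,
                           n_informative=10, random_state=0)
X = np.abs(X)  # chi2 requires non-negative values, like term counts

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Rank all 100 features by chi-squared score; keep the 10 best.
selector = SelectKBest(chi2, k=10).fit(X_tr, y_tr)
X_tr_sel = selector.transform(X_tr)
X_te_sel = selector.transform(X_te)

# Train the downstream model on the reduced feature set only.
clf = LogisticRegression(max_iter=1000).fit(X_tr_sel, y_tr)
print(X_tr_sel.shape[1])  # 10 features survive the filter
```

Because the scorer looks at each feature in isolation, filter methods stay cheap even on high-dimensional term-vector data, which is why the paper contrasts them with costlier wrapper-style selection.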

Cited by 16 publications (6 citation statements)
References 48 publications
“…Therefore, in our future academic research, we will aim to further improve the proposed PERR method by considering the weights of different dimensions to enlarge the application scope of PERR. In addition, we will continue to investigate the possibility of integrating our privacy-aware PERR solution with other classical privacy-preservation techniques, such as blockchain [31][32][33] , differential privacy [34,35] , anonymization [36] , and program code analyses [37,38] . Moreover, computation offloading is often necessary, especially in a big data environment [39][40][41][42][43][44][45] .…”
Section: Discussion
confidence: 99%
“…respectively. Only two studies [46] and [73] used a combination of metrics and text features. Three studies [32], [33], and [39] have utilized patterns, and four studies [7], [13], [16], and [68] used code attributes.…”
Section: Table 3 Quality Assessment Questions
confidence: 99%
“…The experimental research shows that this procedure cuts training time by roughly 68%. In [73], it is mentioned that SVP data sets frequently include several features, which leads to the dimensionality curse. Since other forms of feature selection methods have a high computational cost, the focus of this paper is on the effect of filter-based ranking feature selection (FRFS) approaches on SVP.…”
Section: Figure 3 Year-wise Distribution of Research Publications
confidence: 99%
“…Many works have been done considering the evaluation of different static code analyzers (e.g., [55] for C/C++), but the number of works considering the analysis of the suitability of features generated by them for the purpose of vulnerability prediction is limited. In [56,57], empirical studies considering three open-source PHP web applications were conducted. They based their research on a dataset and twelve metrics introduced in [49].…”
Section: Related Work
confidence: 99%
“…They based their research on a dataset and twelve metrics introduced in [49]. In [56], they examined the performance of different software vulnerability prediction models in terms of effort-aware performance measures, in contrast to [57], where they considered the impact of Filter-based Ranking Feature Selection (FRFS) methods on vulnerability prediction. In [58], an empirical study was conducted to examine a security risk (assessed by the Androrisk application) prediction of Android applications based on 21 code metrics obtained using SonarQube and six machine learning algorithms.…”
Section: Related Work
confidence: 99%