2019
DOI: 10.1109/access.2019.2895614

An Empirical Study on the Effectiveness of Feature Selection for Cross-Project Defect Prediction

Abstract: Software defect prediction has attracted much attention from researchers in software engineering. Feature selection approaches have recently been introduced into software defect prediction and can effectively improve the performance of traditional defect prediction (known as within-project defect prediction, WPDP). However, studies of feature selection for cross-project defect prediction (CPDP) remain insufficient. In this paper, we use feature subset selection and feature ranking approaches to expl…
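The CPDP setting the abstract describes is straightforward to sketch: features are ranked on a labeled source project, and the selected subset is reused when predicting defects in a different target project. The snippet below is a minimal illustration under assumed choices (mutual-information ranking, a logistic-regression classifier, synthetic data); it is not the paper's experimental setup.

```python
# Minimal sketch of filter-based feature ranking for cross-project defect
# prediction (CPDP): rank features on the source project, keep the top-k,
# then train on the source and evaluate on the target project.
# All data and parameter choices here are illustrative assumptions.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Stand-ins for two projects sharing the same 20 static-code metrics.
X_src, y_src = rng.normal(size=(300, 20)), rng.integers(0, 2, 300)
X_tgt, y_tgt = rng.normal(size=(200, 20)), rng.integers(0, 2, 200)

# Rank features on the labeled source project only; the target is untouched.
selector = SelectKBest(mutual_info_classif, k=8).fit(X_src, y_src)

clf = LogisticRegression(max_iter=1000)
clf.fit(selector.transform(X_src), y_src)
pred = clf.predict(selector.transform(X_tgt))
print("cross-project F1:", f1_score(y_tgt, pred))
```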

Cited by 40 publications (20 citation statements)
References 35 publications
“…For traditional module-level defect prediction, many studies use feature selection to improve prediction performance [66], [67]. Laradji et al. [68] carefully combined ensemble learning with efficient feature selection to address issues such as feature correlation and irrelevance.…”
Section: B. Feature Selection Methods
mentioning confidence: 99%
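As a rough illustration of the direction attributed to Laradji et al. [68], combining feature selection with ensemble learning, the sketch below chains a simple filter (ANOVA F-score) into a random forest. The specific filter, ensemble, and dataset are assumptions, not their method.

```python
# Illustrative pipeline: a filter-style feature selection step feeding an
# ensemble classifier. Concrete choices (f_classif, random forest, k=10,
# synthetic data) are assumptions made for this sketch.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=30, n_informative=6,
                           n_redundant=10, random_state=0)

model = make_pipeline(
    SelectKBest(f_classif, k=10),  # keep the 10 strongest features
    RandomForestClassifier(n_estimators=100, random_state=0),
)
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```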
“…Technique(s) | Frequency | References
Data Normalization | 11 | [20], [55], [56], [72], [76], [79], [85], [88], [92], [104]
(row label missing) | | [57], [80], [84], [87], [89]
Data Normalization and Feature Selection | 4 | [90], [91], [116], [117]
Data Normalization and Data Filtering | 4 | [58], [75], [82], [93]
Data Imbalance and Data Filtering | 1 | [94]
Data Filtering and Feature Selection | 3 | [31], [97], [100]
Data Imbalance and Feature Selection | 1 | [40]
Data Normalization, Data Imbalance, and Data Filtering | 5 | [77], [78], [81], [83], [86]
Data Normalization, Data Imbalance, and Feature Selection | | (frequency and references missing)

Feature selection techniques used in the surveyed studies:
• deep belief network based on abstract syntax tree [108], [113]
• correlation-based feature selection for feature subset selection [100], [111]
• improved subclass discriminant analysis [61]
• information flow algorithm [97]
• feature selection using clusters of hybrid-data approach [59]
• top-k feature subset based on number of occurrences of different metrics [109]
• geodesic flow kernel feature selection [110]
• similarity measure …”
Section: Techniques
mentioning confidence: 99%
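Among the techniques listed above, correlation-based feature selection (CFS) is the most self-contained to sketch. The following is a textbook-style reconstruction, a greedy forward search over the CFS merit, which rewards feature-class correlation and penalizes feature-feature correlation; it is illustrative and not taken from any cited study.

```python
# Sketch of correlation-based feature subset selection (CFS): greedy forward
# search maximizing merit = k*r_cf / sqrt(k + k*(k-1)*r_ff), where r_cf is
# the mean |feature-class correlation| and r_ff the mean |feature-feature
# correlation| within the subset. Data and stopping rule are assumptions.
import numpy as np

def cfs_merit(X, y, subset):
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(X, y, max_features=5):
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        best = max(remaining, key=lambda j: cfs_merit(X, y, selected + [j]))
        if selected and cfs_merit(X, y, selected + [best]) <= cfs_merit(X, y, selected):
            break  # stop when no candidate improves the merit
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 12))
y = (X[:, 0] + X[:, 3] + 0.5 * rng.normal(size=200) > 0).astype(float)
print("selected features:", cfs_forward(X, y))
```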
“…Saidi et al. [17] present a feature selection method that combines a genetic algorithm (GA) with the Pearson correlation coefficient (PCC). Yu et al. [18] examine the effectiveness of feature selection in CPDP using feature subset selection and feature ranking approaches. In contrast to conventional feature selection methods that focus only on finding a single discriminating feature, Mao and Yang [19] present a multilayer feature subset selection method that uses randomized searches and multilayer structures to select discriminative subsets.…”
Section: Literature Survey
mentioning confidence: 99%
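To make the GA-plus-PCC idea attributed to Saidi et al. [17] concrete, here is a loose sketch: binary feature masks are evolved, with a fitness that rewards Pearson correlation with the class label and penalizes inter-feature correlation. The operators, weights, and data below are all assumptions for illustration, not the method from [17].

```python
# Hypothetical GA-based feature selection with a Pearson-correlation fitness.
# Fitness = mean |corr(feature, label)| - 0.5 * mean |corr(feature, feature)|;
# the 0.5 weight and all GA operators are illustrative choices.
import numpy as np

rng = np.random.default_rng(2)

def fitness(mask, X, y):
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return -np.inf
    relevance = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in idx])
    if idx.size == 1:
        redundancy = 0.0
    else:
        redundancy = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                              for i, a in enumerate(idx) for b in idx[i + 1:]])
    return relevance - 0.5 * redundancy

def ga_select(X, y, pop=30, gens=40, p_mut=0.05):
    n = X.shape[1]
    population = rng.integers(0, 2, size=(pop, n))
    for _ in range(gens):
        scores = np.array([fitness(m, X, y) for m in population])
        parents = population[np.argsort(scores)[-pop // 2:]]  # keep top half
        children = []
        while len(children) < pop - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut           # bit-flip mutation
            children.append(child)
        population = np.vstack([parents, children])
    best = max(population, key=lambda m: fitness(m, X, y))
    return np.flatnonzero(best)

X = rng.normal(size=(150, 10))
y = (X[:, 1] - X[:, 4] + 0.5 * rng.normal(size=150) > 0).astype(float)
print("selected features:", ga_select(X, y))
```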