2015
DOI: 10.1007/s11390-015-1575-5

A Hybrid Instance Selection Using Nearest-Neighbor for Cross-Project Defect Prediction

Abstract: Software defect prediction (SDP) is an active research field in software engineering to identify defect-prone modules. Thanks to SDP, limited testing resources can be effectively allocated to defect-prone modules. Although SDP requires sufficient local data within a company, there are cases where local data are not available, e.g., pilot projects. Companies without local data can employ cross-project defect prediction (CPDP) using external data to build classifiers. The major challenge of CPDP is different dis…

Citations: cited by 96 publications (84 citation statements)
References: 21 publications

“…Instance and dataset selection methods have been explored in CPDP. These include relevancy filtering by Turhan et al. [1] based on the Euclidean distance measure, data distributional characteristics and meta-learners by He et al. [8], clustering by Herbold [9], and selective learning by Ryu et al. [10]. These studies, however, do not consider a search-based approach.…”
Section: Related Work (mentioning)
confidence: 99%
“…Tomasev et al. [15] argued, regarding the hubness effect in nearest-neighbor methods, that minority-class instances are responsible for misclassification in high-dimensional data, whereas majority-class instances are mostly the cause of misclassification in low- and medium-dimensional data. Ryu et al. [16] proposed HISNN, an instance-based hybrid selection using nearest neighbor for cross-project defect prediction. Here, class imbalance exists in the source and target projects.…”
Section: Related Work (mentioning)
confidence: 99%
“…al. [15] proposed an instance hybrid selection using nearest neighbor (HISNN). In such cases class imbalance exists in source and target project distribution.…”
Section: Related Work (mentioning)
confidence: 99%