2009
DOI: 10.1007/s10664-008-9103-7

On the relative value of cross-company and within-company data for defect prediction

Abstract: We propose a practical defect prediction approach for companies that do not track defect related data. Specifically, we investigate the applicability of cross-company (CC) data for building localized defect predictors using static code features. Firstly, we analyze the conditions, where CC data can be used as is. These conditions turn out to be quite few. Then we apply principles of analogy-based learning (i.e. nearest neighbor (NN) filtering) to CC data, in order to fine tune these models for localization. We …

Cited by 631 publications (646 citation statements)
References 44 publications
“…Two other publications of Khoshgoftaar and Seliya from 2004 [15] and 2005 [16] continued with the previous concept and focused on commercial data analysis, but were not applied to a real-world environment. A similar approach can be found in publications by Ostrand and Weyuker [29], Ostrand et al [31], Tosun et al [45], Turhan et al [47,48]. Examples of industrial applications of information gathered by using defect prediction can be found in publications by Wong et al [50], Succi et al [41] and Kläs et al [17].…”
Section: Related Work (mentioning)
confidence: 85%
“…The performance of CCDP is generally poor because of larger irrelevant CC data. Previous work [23] found that using raw CC data directly would increase false alarm rates due to irrelevant instances in CC data, so several data filtering works should be done before building the prediction model. For example, Turhan et al [23] and Peters et al [24] proposed the NN filter and the Peters filter to select the CC instances which are mostly similar to WC data as the training dataset.…”
Section: Doi Reference Number: 1018293/seke2017-043 (mentioning)
confidence: 99%
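
The citation statement above describes the NN filter only in outline: keep the CC instances that are most similar to the WC data and train on those. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the cited authors' implementation: the Euclidean distance, the default `k=10`, the function name `nn_filter`, and the toy feature values are all assumptions introduced here.

```python
import math


def nn_filter(cc_rows, wc_rows, k=10):
    """Return sorted indices of CC rows that are among the k nearest
    neighbours of at least one WC row.

    Assumptions (not taken from the source): Euclidean distance on raw
    feature values, and k=10 as the default neighbourhood size.
    """
    selected = set()
    for wc in wc_rows:
        # Rank every CC row by its distance to this WC row.
        ranked = sorted(
            range(len(cc_rows)),
            key=lambda i: math.dist(cc_rows[i], wc),
        )
        # Keep the k closest; the set removes rows chosen more than once.
        selected.update(ranked[:k])
    return sorted(selected)


# Hypothetical toy data: two static code features per module.
cc = [(1.0, 1.0), (1.2, 0.9), (9.0, 9.5), (5.0, 5.0), (0.5, 1.5)]
wc = [(1.0, 1.0), (5.2, 4.9)]

kept = nn_filter(cc, wc, k=2)
training_set = [cc[i] for i in kept]
print(kept)
```

The returned indices identify the filtered training set. In this toy run the CC row far from both WC rows, `(9.0, 9.5)`, is never selected, which is the intended effect of discarding irrelevant CC instances. In practice the features would likely need scaling before distances are computed; that step is omitted here.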
“…The first one is to apply the data filtering method to find the best suitable training data (e.g., [23,24,26]). For example, Turhan et al [23] proposed a nearest neighbor (NN) filter to select cross-company data.…”
Section: B Cross-company Defect Prediction (mentioning)
confidence: 99%
“…We also conducted some preliminary experiments to create a personality prediction model for a country. While there are many software engineering papers which utilize from cross-company or cross-project data such as cross-project defect prediction [25]- [31] and cross-company effort estimation [32]- [38], there is no cross-cultural personality prediction study yet. Therefore, we see a big research potential on the analysis of cross-cultural personality prediction models.…”
Section: Introduction (mentioning)
confidence: 99%