Replication of empirical studies in software engineering research: a systematic mapping study

Silva, Fabio Q.; Suassuna, Marcos; França, A. César C.; Grubb, Alicia M.; Gouveia, Tatiana B.; Monteiro, Cleviton V. F.; Santos, Igor Ebrahim

doi:10.1007/s10664-012-9227-7

Cited by 57 publications

(90 citation statements)

References 31 publications

“…The software engineering community has been readily embracing replications (e.g., [9,25,38]). There are two important factors characterizing a replication: procedure (i.e., followed experimental steps) and researcher (i.e., who conducted the replication).…”

Section: External Replications in SE (mentioning)

confidence: 99%

“…An internal replication is conducted by the same group of researchers as the baseline experiment [30], while an external replication is performed by different experimenters. da Silva et al systematic review [9] showed that the majority of replications between 1994 and 2010 were internal. Alongside, internal replications tend to report positive results not only in SE [7], but also in other disciplines [1].…”

Section: External Replications in SE (mentioning)

confidence: 99%

An External Replication on the Effects of Test-driven Development Using a Multi-site Blind Analysis Approach

Fucci, Scanniello, Romano, et al. 2016

Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement

Context: Test-driven development (TDD) is an agile practice claimed to improve the quality of a software product, as well as the productivity of its developers. A previous study (i.e., baseline experiment) at the University of Oulu (Finland) compared TDD to a test-last development (TLD) approach through a randomized controlled trial. The results failed to support the claims. Goal: We want to validate the original study results by replicating it at the University of Basilicata (Italy), using a different design. Method: We replicated the baseline experiment, using a crossover design, with 21 graduate students. We kept the settings and context as close as possible to the baseline experiment. In order to limit researchers' bias, we involved two other sites (UPM, Spain, and Brunel, UK) to conduct blind analysis of the data. Results: The Kruskal-Wallis tests did not show any significant difference between TDD and TLD in terms of testing effort (p-value = .27), external code quality (p-value = .82), and developers' productivity (p-value = .83). Nevertheless, our data revealed a difference based on the order in which TDD and TLD were applied, though no carry-over effect. Conclusions: We verify the baseline study results, yet our results raise concerns regarding the selection of experimental objects, particularly with respect to their interaction with the order in which treatments are applied. We recommend future studies to survey the tasks used in experiments evaluating TDD. Finally, to lower the cost of replication studies and reduce researchers' bias, we encourage other research groups to adopt a multi-site blind analysis approach similar to the one described in this paper.

“…There was not one replication in this sample (Zannier et al 2006). A recently published paper (Silva et al 2012) found, from 1994 to 2010, 96 papers reporting 133 replications of 72 empirical studies (including not only experiments, but also case studies, surveys and others). The baseline studies were replicated on average 1.8 (133/72) times.…”

Section: Background on Replications (mentioning)

confidence: 74%

“…For experimental replications to have scientific value comparable to that of other types of empirical studies, they must be published in the peer-reviewed literature. (Silva et al 2012) There have traditionally been limited opportunities to publish peer-reviewed replication papers in journals. It has been argued (Kitchenham 2008) that publishing isolated replications is hard.…”

Section: Background on Replications (mentioning)

confidence: 99%

Replications of software engineering experiments

et al. 2013

Replication is an essential part of the experimental paradigm and is considered the main component of scientific knowledge. There are many open issues that must be addressed before the replication process can be successfully formalized in empirical software engineering research. The software engineering community learns a great deal from performing replications, reading reports of replications performed by others and aggregating the results of replications to draw deeper conclusions than would otherwise be possible. Experimental replications need to be published in the peer-reviewed literature to have scientific value comparable to that of other types of empirical studies. Significant efforts have been made to draw attention to the importance of publishing replications to advance the experimental research paradigm within software engineering and to provide a number of examples of such replications.

“…They found a total of 113 controlled experiments, of which 20 (17.7%) are described as replications. Silva et al [65] have conducted a systematic review of SE replications. They found 96 papers reporting 133 replications of 72 original studies run from 1994 to 2010.…”

Section: Introduction (mentioning)

confidence: 99%

Understanding replication of experiments in software engineering: A classification

Gómez, Juristo, Vegas 2014

Information and Software Technology

Context: Replication plays an important role in experimental disciplines. There are still many uncertainties about how to proceed with replications of SE experiments. Should replicators reuse the baseline experiment materials? How much liaison should there be among the original and replicating experimenters, if any? What elements of the experimental configuration can be changed for the experiment to be considered a replication rather than a new experiment? Objective: To improve our understanding of SE experiment replication, in this work we propose a classification which is intended to provide experimenters with guidance about what types of replication they can perform. Method: The research approach followed is structured according to the following activities: (1) a literature review of experiment replication in SE and in other disciplines, (2) identification of typical elements that compose an experimental configuration, (3) identification of different replication purposes and (4) development of a classification of experiment replications for SE. Results: We propose a classification of replications which provides experimenters in SE with guidance about what changes they can make in a replication and, based on these, what verification purposes such a replication can serve. The proposed classification helped to accommodate opposing views within a broader framework; it is capable of accounting for less similar replications to more similar ones regarding the baseline experiment. Conclusion: The aim of replication is to verify results, but different types of replication serve special verification purposes and afford different degrees of change. Each replication type helps to discover particular experimental conditions that might influence the results. The proposed classification can be used to identify changes in a replication and, based on these, understand the level of verification.
