Proceedings of the Evaluation and Assessment on Software Engineering 2019
DOI: 10.1145/3319008.3319009
Problems with Statistical Practice in Human-Centric Software Engineering Experiments

Abstract: Background: Examples of questionable statistical practice, when published in high-quality software engineering (SE) journals, may lead novice researchers to adopt incorrect statistical practices. Objective: Our goal is to highlight issues contributing to poor statistical practice in human-centric SE experiments. Method: We reviewed the statistical analysis practices used in the 13 papers that reported families of human-centric SE experiments and were published in high-quality journals. Results: Reviewed papers …

Cited by 13 publications (10 citation statements)
References 52 publications (62 reference statements)
“…During data extraction, it became clear that many of our 13 primary studies included experiments with crossover designs. Vegas et al. (2016) warned that the terminology used to describe crossover designs was not used consistently, and we found exactly the same problem with our primary studies (Kitchenham et al. 2019a). Therefore, we used the description of the experimental design provided by the authors to derive our own classification.…”
Section: Experimental Methods Used by the Primary Studies (RQ2)
Citation type: mentioning, confidence: 94%
“…Finally, we have also observed some weaknesses in the experimenters' statistical knowledge. This problem was previously pointed out by other researchers, e.g., [17, 28]. The ESEM community (and the overall SE community as well) should establish measures to improve experimenters' statistical skills.…”
Section: Discussion
Citation type: mentioning, confidence: 85%
“…Given our focus on journals, we extracted data from: Transactions on Software Engineering (TSE), Transactions on Software Engineering and Methodology (TOSEM), Empirical Software Engineering (EMSE), Journal of Systems and Software (JSS), and Information and Software Technology (IST). The same sample of journals was used in previous studies by Kitchenham et al. [52]. The main reason for selecting these five journals is that they are well-known, top-ranked software engineering journals focusing primarily on applied scientific contributions.…”
Section: Screening and Selection of Papers
Citation type: mentioning, confidence: 99%
“…Even though they are a subset of existing peer-reviewed publication venues, they comprise five popular and top-ranked SE journals. Also, the same sample has been used in similar ESE studies [52]. An alternative would have been to include conference papers as well, but we argue that the limited number of pages available to conference papers could hinder researchers from reporting thoroughly on their empirical studies, e.g., by not including enough details on the choice and usage of statistics.…”
Section: Threats to Validity
Citation type: mentioning, confidence: 99%