2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion) 2021
DOI: 10.1109/icse-companion52605.2021.00081
|View full text |Cite
|
Sign up to set email alerts
|

FlakeFlagger: Predicting Flakiness Without Rerunning Tests

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

5
63
0

Year Published

2021
2021
2024
2024

Publication Types

Select...
4
3

Relationship

0
7

Authors

Journals

citations
Cited by 13 publications
(68 citation statements)
references
References 36 publications
5
63
0
Order By: Relevance
“…Indeed, while reruns can affirm that a failure is due to flakiness by manifesting a test pass and fail for the same version, they do not allow us to affirm that a failure is legitimate. A previous study showed that up to 10,000 reruns can be required to discover flaky tests that have a low flake rate [1]. Hence, a legitimate failure in the case of Chromium can still be a false alert (flaky failure) that was not rerun enough to manifest.…”
Section: Test Historymentioning
confidence: 99%
See 3 more Smart Citations
“…Indeed, while reruns can affirm that a failure is due to flakiness by manifesting a test pass and fail for the same version, they do not allow us to affirm that a failure is legitimate. A previous study showed that up to 10,000 reruns can be required to discover flaky tests that have a low flake rate [1]. Hence, a legitimate failure in the case of Chromium can still be a false alert (flaky failure) that was not rerun enough to manifest.…”
Section: Test Historymentioning
confidence: 99%
“…Each failure in our dataset is denoted as an n-dimensional feature vector đť‘‹ = (đť‘Ą 1 , ..., đť‘Ą đť‘› ) where đť‘Ą đť‘– represents one feature. 𝑦 = {0, 1} indicates if the failure is from the false alert class (0) or from the legitimate failure class (1). Once all vectors are created, we randomly split our dataset by including 80% of it in the training set and 20% in the test set, conserving the class ratio in each subset (stratified).…”
Section: Failure Classifiermentioning
confidence: 99%
See 2 more Smart Citations
“…Other studies investigated tools and techniques that could help developers to cope with test flakiness. Automated tools, such as DeFlaker [11], iDFlakies [12], and FlakeFlagger [13] have been developed in order to detect flaky tests with a minimum number of test runs or re-runs. Unfortunately, these advances offer only partial solutions to the problem and may not fit well within the development systems and organisation constraints.…”
Section: Introductionmentioning
confidence: 99%