2022
DOI: 10.48550/arxiv.2204.06815
Preprint

deep-significance - Easy and Meaningful Statistical Significance Testing in the Age of Neural Networks

Abstract: A lot of Machine Learning (ML) and Deep Learning (DL) research is of an empirical nature. Nevertheless, statistical significance testing (SST) is still not widely used. This endangers true progress, as seeming improvements over a baseline might be statistical flukes, leading follow-up research astray while wasting human and computational resources. Here, we provide an easy-to-use package containing different significance tests and utility functions specifically tailored towards research needs and usability.

Citations: cited by 5 publications (6 citation statements)
References: 19 publications
“…To demonstrate how significant the superiority of the model trained on a signal from the central and front EEG channels (C3-M2, CZ-O1, F3-M2, and F4-M1) was over the back channels (O1-M2 and O2-M1), we employed the almost stochastic order (ASO) significance test [37, 38], as implemented by [39]. The method compares scores from two deep learning models and by computing the significance score .…”
Section: Discussion (mentioning)
confidence: 99%
“…To perform significance tests we used the Almost Stochastic Order (ASO) test [36, 37] with a 95% confidence level (α = 0.05). Each ASO test outputs a violation error ε_min which denotes the degree to which the hypothesis that "method A is always better than method B" is being violated.…”
Section: Significance Tests (mentioning)
confidence: 99%
“…Model comparisons are performed using the almost stochastic order test. Following the recommendation of Ulmer et al. (2022), τ = 0.2 is used as a threshold for the significance of the superiority of one model over another. Moreover, the error level α = 0.05 is already included in the ε_min calculation.…”
Section: Parameters Of Statistical Tests (mentioning)
confidence: 99%
“…The smaller ε_min, the more certain is the superiority of model A. Here, τ = 0.2 is recommended as an acceptance limit to achieve a balance between false positives and false negatives (Ulmer et al. (2022)), although the ε_min result should always be included for the sake of transparency. Note, however, that ε_min is not a p value, unlike other statistical tests, but the significance level α is already included in the calculation of ε_min.…”
Section: Structure (mentioning)
confidence: 99%
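
As a rough illustration of the quantity these citation statements describe, the Python sketch below computes an uncorrected violation ratio between two sets of scores and compares it against the τ = 0.2 threshold quoted above. It is a simplified sketch under stated assumptions, not the deep-significance implementation: the names `violation_ratio`, `below_threshold`, and `num_steps` are hypothetical, the value returned for identical score distributions is an arbitrary choice, and the bootstrap-based correction at level α that turns the raw ratio into ε_min is omitted entirely.

```python
from typing import Sequence


def violation_ratio(scores_a: Sequence[float],
                    scores_b: Sequence[float],
                    num_steps: int = 200) -> float:
    """Uncorrected degree to which "A is stochastically better than B" is violated.

    Simplified sketch: compares the empirical quantile functions of both
    score sets on an evenly spaced grid. Higher scores are assumed to be better.
    The ASO test's epsilon_min additionally applies a bootstrap correction,
    which is NOT done here.
    """
    a = sorted(scores_a)
    b = sorted(scores_b)
    total = 0.0      # squared quantile differences over the whole grid
    violated = 0.0   # same, restricted to points where A falls below B
    for k in range(1, num_steps):
        # Empirical quantile at t = k / num_steps, using integer ceil division
        # so that the index is exact: ceil(t * n) - 1.
        qa = a[(k * len(a) + num_steps - 1) // num_steps - 1]
        qb = b[(k * len(b) + num_steps - 1) // num_steps - 1]
        squared_diff = (qa - qb) ** 2
        total += squared_diff
        if qa < qb:
            violated += squared_diff
    if total == 0.0:
        # Identical quantile functions: neither side dominates (assumed convention).
        return 0.5
    return violated / total


def below_threshold(value: float, tau: float = 0.2) -> bool:
    """Decision rule from the quoted statements: accept superiority if value < tau."""
    return value < tau


if __name__ == "__main__":
    model_a = [0.90, 0.80, 0.85]  # made-up scores, for illustration only
    model_b = [0.50, 0.60, 0.55]
    ratio = violation_ratio(model_a, model_b)
    print(ratio, below_threshold(ratio))
```

With the made-up scores above, every quantile of A exceeds the matching quantile of B, so the ratio is 0.0 and falls below τ. For real analyses the corrected ε_min from the package itself should be reported, as the quoted statements recommend.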