2017
DOI: 10.1175/mwr-d-16-0037.1

Detecting Improvements in Forecast Correlation Skill: Statistical Testing and Power Analysis

Abstract: The skill of weather and climate forecast systems is often assessed by calculating the correlation coefficient between past forecasts and their verifying observations. Improvements in forecast skill can thus be quantified by correlation differences. The uncertainty in the correlation difference needs to be assessed to judge whether the observed difference constitutes a genuine improvement, or is compatible with random sampling variations. A widely used statistical test for correlation difference is known to be…
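As a rough illustration of the problem the abstract describes, the sketch below computes a test statistic for the difference between two correlations that share the same observations. It uses Williams's t-test for dependent, overlapping correlations (as given by Steiger, 1980). This is a swapped-in textbook technique chosen for illustration; the truncated abstract does not say which test the paper recommends. The function names `pearson` and `williams_t` are hypothetical, and the sketch assumes the forecast-observation pairs are independent in time.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def williams_t(obs, fc_a, fc_b):
    """Williams's t statistic for H0: corr(obs, fc_a) == corr(obs, fc_b).

    Returns (t, df) with df = n - 3. Assumes the n cases are independent;
    serial correlation is NOT accounted for in this sketch.
    """
    n = len(obs)
    r_a = pearson(obs, fc_a)    # skill of forecast system A
    r_b = pearson(obs, fc_b)    # skill of forecast system B
    r_ab = pearson(fc_a, fc_b)  # dependence between the two forecasts

    # Determinant of the 3x3 correlation matrix of (obs, fc_a, fc_b).
    det = 1 - r_a**2 - r_b**2 - r_ab**2 + 2 * r_a * r_b * r_ab
    r_bar = (r_a + r_b) / 2
    denom = 2 * (n - 1) / (n - 3) * det + r_bar**2 * (1 - r_ab) ** 3
    t = (r_a - r_b) * math.sqrt((n - 1) * (1 + r_ab) / denom)
    return t, n - 3
```

The statistic can be compared against a Student t distribution with `df` degrees of freedom, for example via `scipy.stats.t.sf(t, df)` for a one-sided p-value if SciPy is available.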

Cited by 44 publications (49 citation statements)
References: 30 publications
“…The former measure is the absolute difference in correlation values of the INIT and NoINIT hindcasts against observation. It is the most commonly used skill measure to assess the contribution of the internally generated climate variability component to the multiannual predictive skill (Siegert et al 2017). However, it has been pointed out by some studies (Siegert et al 2017, Smith et al 2019) that using correlation difference for two highly correlated hindcasts underestimates the impact of initialization.…”
Section: Forecast Quality Assessment (mentioning)
confidence: 99%
“…It is the most commonly used skill measure to assess the contribution of the internally generated climate variability component to the multiannual predictive skill (Siegert et al 2017). However, it has been pointed out by some studies (Siegert et al 2017, Smith et al 2019) that using correlation difference for two highly correlated hindcasts underestimates the impact of initialization. Therefore, in order to determine whether initialization adds any additional information compared to the uninitialized forecast in the areas where both INIT and NoINIT return high correlation values, we apply the residual correlation methodology recently suggested by Smith et al (2019).…”
Section: Forecast Quality Assessment (mentioning)
confidence: 99%
“…We use the Student distribution with N degrees of freedom to estimate the significance level of correlation, N being the effective number of independent data calculated following the method of von Storch and Zwiers (2001). The significance of the difference between two correlations is estimated using the methodology of Siegert et al (2016), which takes into account the dependence from sharing the same observations in both correlation coefficients. This method to assess the significance of the difference of two correlations also takes into account the independent number of data, which is necessary given the serial correlation typical of the time series considered.…”
Section: Skill Assessment (mentioning)
confidence: 99%
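The effective-sample-size idea in the statement above can be sketched as follows. The formula used here, n_eff = n (1 - r1x r1y) / (1 + r1x r1y) with lag-1 autocorrelations r1x and r1y, is a common AR(1)-based approximation and is an assumption of this sketch; it is not necessarily the exact procedure of von Storch and Zwiers (2001) or of the citing paper. The function names are hypothetical.

```python
import math


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of a sequence."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def effective_n(x, y):
    """Approximate effective sample size for corr(x, y).

    Assumes both series behave like AR(1) processes (an assumption of
    this sketch). The result is capped at the nominal sample size.
    """
    n = len(x)
    rho = lag1_autocorr(x) * lag1_autocorr(y)
    return min(float(n), n * (1 - rho) / (1 + rho))


def corr_t_stat(r, n_eff):
    """t statistic for H0: correlation == 0, using n_eff - 2 degrees of freedom."""
    return r * math.sqrt((n_eff - 2) / (1 - r**2))
```

Strong positive serial correlation shrinks `effective_n` well below the nominal length, which lowers the t statistic and makes a given correlation harder to declare significant.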
“…To test a forecast system at the seasonal and/or decadal timescales, an ensemble of several tens of predictions is commonly required to ensure enough confidence in the evaluation of the skill (Siegert et al 2017). Hence, it is statistically not possible to test a forecast system by considering only the three predictions initialised the years of the last three major eruptions.…”
Section: Skill Related To Volcanic Forcing (mentioning)
confidence: 99%
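To see why a handful of cases cannot support a skill assessment, the sketch below computes an approximate confidence interval for a correlation using the standard Fisher z-transformation, whose standard error is 1/sqrt(n - 3). This is a textbook approximation chosen for illustration, not a method taken from the citing paper. The function name `fisher_ci` is hypothetical, and the sketch assumes independent, roughly bivariate-normal data.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation r from n cases.

    Uses the Fisher z-transformation; z_crit = 1.96 gives a nominal
    95% interval. Assumes independent cases (no serial correlation).
    """
    if n <= 3:
        # The standard error 1/sqrt(n - 3) is undefined for n <= 3.
        raise ValueError("need more than 3 cases for a Fisher z interval")
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

Under these assumptions, a sample correlation of 0.6 from 10 cases gives an interval that includes zero, while the same correlation from 40 cases does not. With only three cases the interval cannot be computed at all.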
“…The correlation difference between the hindcast including the volcanic forcing and the other experiments is significant at 95% when an inverted triangle is superimposed to the curve. As recommended in (Siegert et al 2017), the correlation between the different experiments is taken into account to compute the statistical significance of the skill differences correctly. These differences are never significant in (a).…”
Section: Skill Related To Volcanic Forcing (mentioning)
confidence: 99%