Advancements in weather forecast models and their enhanced resolution have led to substantially improved and more realistic-appearing forecasts for some variables. However, traditional verification scores often indicate poor performance because of the increased small-scale variability, so the true quality of the forecasts is not always well characterized. As a result, numerous new methods for verifying these forecasts have been proposed. These new methods can mostly be classified into two overall categories: filtering methods and displacement methods. The filtering methods can be further delineated into neighborhood and scale-separation approaches, and the displacement methods can be divided into feature-based and field-deformation approaches. Each method gives considerably more information than the traditional scores, but it is not clear which method(s) should be used for which purpose. A verification methods intercomparison project has been established in order to glean a better understanding of the proposed methods in terms of their various characteristics and to determine which verification questions each method addresses. The study is ongoing, and preliminary qualitative results for the different approaches applied to different situations are described here. In particular, the various methods and their basic characteristics, similarities, and differences are described. In addition, several questions are addressed regarding the application of the methods and the information that they provide. These questions include (i) how the methods provide information about performance at different scales; (ii) how the methods provide information on location errors; (iii) whether the methods provide information on intensity errors and distributions; (iv) whether the methods provide information on structure errors; (v) whether the approaches can provide information about hits, misses, and false alarms; (vi) whether the methods do anything counterintuitive; (vii) whether the methods have selectable parameters and how sensitive the results are to parameter selection; (viii) whether the results can be easily aggregated across multiple cases; (ix) whether the methods can identify timing errors; and (x) whether confidence intervals and hypothesis tests can be readily computed.
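As a concrete illustration of the neighborhood (filtering) class discussed above, the minimal sketch below computes the fractions skill score (FSS) of Roberts and Lean (2008), one widely used neighborhood method. The function name, the event threshold, and the zero-padded boundary handling are illustrative assumptions, not details taken from the intercomparison project itself.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fractions_skill_score(forecast, observed, threshold, neighborhood):
    """Fractions skill score (FSS), a neighborhood (filtering) method.

    forecast, observed : 2D gridded fields (e.g., precipitation rates)
    threshold          : event threshold defining the binary event
    neighborhood       : square neighborhood width in grid points
    """
    # Binary event fields at the chosen threshold.
    f_event = (forecast >= threshold).astype(float)
    o_event = (observed >= threshold).astype(float)

    # Fraction of event grid points within each neighborhood window
    # (zero padding at the domain edges is an illustrative choice).
    f_frac = uniform_filter(f_event, size=neighborhood, mode="constant")
    o_frac = uniform_filter(o_event, size=neighborhood, mode="constant")

    # Compare the fraction fields: 1 is a perfect match, 0 is no skill.
    mse = np.mean((f_frac - o_frac) ** 2)
    mse_ref = np.mean(f_frac ** 2) + np.mean(o_frac ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```

Repeating the calculation for increasing neighborhood sizes reveals the smallest scale at which a forecast attains useful skill, which is precisely the kind of scale-dependent performance information referred to in question (i) above.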
ABSTRACT: Research and development of new verification strategies and reassessment of traditional forecast verification methods have received a great deal of attention from the scientific community in the last decade. This scientific effort has arisen from the need to respond to changes encompassing several aspects of the verification process, such as the evolution of forecasting systems or the desire for more meaningful verification approaches that address specific forecast user requirements. Verification techniques that account for the spatial structure and the presence of features in forecast fields, and which are designed specifically for high-resolution forecasts, have been developed. The advent of ensemble forecasts has motivated the re-evaluation of some of the traditional scores and the development of new verification methods for probability forecasts. The expected climatological increase in extreme events and their potential socioeconomic impacts have revitalized research addressing the challenges of extreme-event verification. Verification issues encountered in the operational forecasting environment have been widely discussed, verification needs for different user communities have been identified, and models to assess the forecast value for specific users have been proposed. Proper verification practice and correct interpretation of verification statistics have been extensively promoted through recent publications and books, tutorials and workshops, and the development of open-source software and verification tools. This paper addresses some of the current issues in forecast verification, reviews some of the most recently developed verification techniques, and provides recommendations for future research.
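As one example of the traditional probability-forecast scores whose re-evaluation the review discusses, the sketch below computes the Brier score and a simple skill score against a climatological reference. This is a standard textbook formulation offered for orientation, not code from the paper; the function names are illustrative.

```python
import numpy as np

def brier_score(prob_forecasts, outcomes):
    """Brier score for probability forecasts of a binary event.

    prob_forecasts : array of forecast probabilities in [0, 1]
    outcomes       : array of observed outcomes (1 = event, 0 = no event)
    Lower is better; 0 indicates a perfect set of forecasts.
    """
    p = np.asarray(prob_forecasts, dtype=float)
    o = np.asarray(outcomes, dtype=float)
    return np.mean((p - o) ** 2)

def brier_skill_score(prob_forecasts, outcomes):
    """Skill relative to always forecasting the climatological base rate."""
    o = np.asarray(outcomes, dtype=float)
    base_rate = np.mean(o)
    bs_ref = brier_score(np.full_like(o, base_rate), o)
    return 1.0 - brier_score(prob_forecasts, o) / bs_ref
```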
Accurate prediction of rare high-impact events represents a major challenge for weather and climate forecasting, and assessment of skill at forecasting such events is problematic precisely because of their rarity. Skill scores traditionally used to verify deterministic forecasts of rare binary events, such as the equitable threat score (ETS), have the disadvantage that they tend to zero for vanishingly rare events. This creates the misleading impression that rare events cannot be skilfully forecast no matter which forecasting system is used. This study presents a simple model for rare binary-event forecasts and uses it to demonstrate the trivial, non-informative limit behaviour of several often-used scores such as ETS. The extreme dependency score (EDS) is proposed as a more informative alternative for assessing skill in deterministic forecasts of rare events. The EDS has the advantage that it can converge to different values for different forecasting systems, and furthermore it does not explicitly depend upon the bias of the forecasting system. The concepts and scores are demonstrated using Met Office forecasts of 6-hourly precipitation totals for Eskdalemuir in Scotland over the period 1998-2003.
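For concreteness, a minimal sketch of both scores computed from a standard 2x2 contingency table, using the usual convention a = hits, b = false alarms, c = misses, and n = total number of cases (the function names are illustrative):

```python
import math

def ets(a, b, c, n):
    """Equitable threat score (Gilbert skill score) from a 2x2 table.

    a = hits, b = false alarms, c = misses, n = total number of cases.
    """
    a_random = (a + b) * (a + c) / n        # hits expected by chance
    return (a - a_random) / (a + b + c - a_random)

def eds(a, c, n):
    """Extreme dependency score (EDS); assumes a > 0 and a + c < n.

    EDS = 2 ln((a + c)/n) / ln(a/n) - 1. It depends only on the base
    rate (a + c)/n and the hit fraction a/n, so it does not explicitly
    involve the bias of the forecasting system.
    """
    base_rate = (a + c) / n
    return 2.0 * math.log(base_rate) / math.log(a / n) - 1.0
```

As the base rate (a + c)/n tends to zero, ETS is driven toward zero regardless of forecast quality, whereas EDS can converge to a non-trivial limit that still distinguishes between forecasting systems, which is the behaviour the abstract describes.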
Increased human activity in the Arctic calls for accurate and reliable weather predictions. This study presents an intercomparison of operational and/or high-resolution models in an attempt to establish a baseline for present-day Arctic short-range forecast capabilities for near-surface weather (pressure, wind speed, temperature, precipitation, and total cloud cover) during winter. One global model [the high-resolution version of the ECMWF Integrated Forecasting System (IFS-HRES)] and three high-resolution, limited-area models [Applications of Research to Operations at Mesoscale (AROME)-Arctic, Canadian Arctic Prediction System (CAPS), and AROME with Météo-France setup (MF-AROME)] are evaluated. As part of the model intercomparison, several aspects of the impact of observation errors and representativeness on the verification are discussed. The results show how the forecasts differ in their spatial details and how forecast accuracy varies with region, parameter, lead time, weather situation, and forecast system, and they confirm many findings from mid- or lower latitudes. While some weaknesses are unique to, or more pronounced in, some of the systems, several common model deficiencies are found, such as difficulty forecasting temperature during cloud-free, calm weather; a cold bias in windy conditions; poor discrimination between freezing and melting conditions; underestimation of solid precipitation; less skillful wind speed forecasts over land than over ocean; and difficulties with small-scale spatial variability. The added value of the high-resolution, limited-area models is most pronounced for wind speed and temperature in regions with complex terrain and coastlines. However, forecast errors grow faster in the high-resolution models. This study also shows that observation errors and representativeness can account for a substantial part of the difference between forecasts and observations in standard verification.