2021
DOI: 10.1029/2020wr029001

The Abuse of Popular Performance Metrics in Hydrologic Modeling

Abstract: The purpose of this commentary is to critically evaluate performance metrics that are habitually used in hydrologic modeling. Our specific objectives are three-fold: (a) provide tools to quantify the sampling uncertainty in performance metrics; (b) quantify the sampling uncertainty in the popular performance metrics across a large sample of catchments; (c) prescribe further research that is needed to improve the estimation, interpretation, and use of performance metrics in hydrologic modeling. Our overall inte…

Cited by 131 publications (79 citation statements)
References: 78 publications
“…In contrast, the modeling approach described here calibrates to specific aspects of the flow regime, avoiding such tradeoffs, and likely increasing model predictive accuracy of functional flow metrics. Hydrologic models are typically evaluated using a limited set of "goodness of fit" (GOF) criteria, such as r-squared and NSE, to compare predictions with paired observations (Clark et al, 2021). The performance assessment approach used in this study used a broader suite of criteria, including both GOF and measures that evaluate the degree to which the distributions of predictions align with observations.…”
Section: Discussion (mentioning)
confidence: 99%
“…To assess model performance, we compared predicted FFM values with observations at sites excluded from model training. We restricted the assessment to sites with 20 or more observations (i.e., 20 years of record) and calculated several model performance criteria to limit the risk of flawed interpretation resulting from the use of a single performance metric (Clark et al, 2021). We calculated performance criteria that provided measures of both the dispersion and central tendency of model predictions in comparison to observed values.…”
Section: Functional Flow Component (mentioning)
confidence: 99%
“…For both case studies we calibrated the HyMod model by minimizing the Nash‐Sutcliffe efficiency. It is well known that performance metrics are affected by significant sampling uncertainty (Barber et al., 2020; Clark et al., 2021). Lamontagne et al.…”
Section: Case Studies (mentioning)
confidence: 99%
“…For both case studies we calibrated the HyMod model by minimizing the Nash-Sutcliffe efficiency. It is well known that performance metrics are affected by significant sampling uncertainty (Barber et al, 2020; Clark et al, 2021). Lamontagne et al (2020) have shown that estimation robustness may be improved by performing a preliminary logarithmic transformation of observed and simulated river flow data.…”
Section: Case Studies (mentioning)
confidence: 99%
“…Thus, GR4J NSE and HBV NSE have higher median NSE efficiencies (0.71 and 0.64, respectively) than GR4J KGE and HBV KGE (0.66 and 0.63, respectively) on the evaluation (hindcast) period. Obtained results raise a question of the best metric that could serve the needs of all interested parties: professional community, government agencies, and general public [49]. Currently, NSE and KGE metrics are popular only within the hydrological community.…”
Section: Consistency Between Calibration and Evaluation Periods (mentioning)
confidence: 99%