The need for operational reasoning in data-driven rating curve prediction of suspended sediment. Hydrological Processes, 26 (26). pp. 3982-4000. ISSN 1099-1085. Access from the University of Nottingham repository: http://eprints.nottingham.ac.uk/28055/1/HYP%2011-0353%20R1.pdf
Copyright and reuse: The Nottingham ePrints service makes this work by researchers of the University of Nottingham available open access under the following conditions. This article is made available under the University of Nottingham End User licence and may be reused according to the conditions of the licence. For more details see: http://eprints.nottingham.ac.uk/end_user_agreement.pdf
A note on versions: The version presented here may differ from the published version or from the version of record. If you wish to cite this item you are advised to consult the publisher's version. Please see the repository url above for details on accessing the published version and note that access may require a subscription. For more information, please contact eprints@nottingham.ac.uk

The need for operational reasoning in data-driven rating curve prediction of suspended sediment.
Journal: Hydrological Processes
Abstract
The use of data-driven modelling techniques to deliver improved suspended sediment rating curves has received considerable interest in recent years. Studies indicate an increased level of performance over traditional approaches when such techniques are adopted. However, closer scrutiny reveals that, unlike their traditional counterparts, data-driven solutions commonly include lagged sediment data as model inputs, and this seriously limits their operational application. In this paper we argue the need for a greater degree of operational reasoning underpinning data-driven rating curve solutions and demonstrate how incorrect conclusions about the performance of a data-driven modelling technique can be reached when the model solution is based upon operationally-invalid input combinations. We exemplify the problem through the re-analysis and augmentation of a recent and typical published study which uses gene expression programming to model the rating curve. We compare and contrast the previously-published solutions, whose inputs negate their operational application, with a range of newly developed and directly comparable traditional and data-driven solutions which do have operational value. Results clearly demonstrate that the performance benefits of the published gene expression programming solutions are dependent on the inclusion of operationally-limiting, lagged data inputs. Indeed, when operationally-inapplicable input combinations are discounted from the models, and the analysis is repeated, gene expression programming fails to perform as well as many simpler, more standard multiple linear regression, piecewise linear regression and neural network counterparts. The potential for overstatement of the benefits of the data-driven paradigm in rating curve studies is thus highlighted.
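
The distinction between operationally-valid and operationally-invalid input combinations can be illustrated with a minimal sketch. It is not the method used in the study. It assumes the conventional power-law form of the rating curve, S = a * Q**b, fitted by least squares on log-transformed values, and it uses a hypothetical naming convention in which inputs beginning with "Q" denote discharge and inputs beginning with "S" denote sediment. The function names, the candidate input sets and the noise-free synthetic data are all assumptions made for illustration only.

```python
import math


def fit_power_law_rating_curve(discharge, sediment):
    """Fit S = a * Q**b by ordinary least squares in log-log space."""
    xs = [math.log(q) for q in discharge]
    ys = [math.log(s) for s in sediment]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept, back-transformed
    return a, b


def is_operationally_valid(input_names):
    """Assumed rule: an input set is usable in operation only if it
    requires no observed sediment values (names starting with 'S')."""
    return not any(name.startswith("S") for name in input_names)


# Hypothetical input combinations (Q = discharge, S = sediment, suffix = lag).
candidates = {
    "Q_t": ["Q_t"],
    "Q_t, Q_t-1": ["Q_t", "Q_t-1"],
    "Q_t, S_t-1": ["Q_t", "S_t-1"],  # needs yesterday's measured sediment
}
valid = {label: is_operationally_valid(names)
         for label, names in candidates.items()}

# Synthetic, noise-free data generated with a = 2.0 and b = 1.5 (illustrative).
discharge = [1.0, 2.0, 4.0, 8.0]
sediment = [2.0 * q ** 1.5 for q in discharge]
a, b = fit_power_law_rating_curve(discharge, sediment)

for label, ok in valid.items():
    print(f"{label}: {'operational' if ok else 'not operational'}")
print(f"fitted a={a:.3f}, b={b:.3f}")
```

Under this assumed rule, only the combination containing a lagged sediment term is flagged as non-operational, since a model that needs a measured sediment value at the previous time step cannot be run where sediment is not being observed.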
Keywords
Suspended sediment; data-driven; rati...