A kernel estimator (KQ) of the quantile function is presented here. Boundary kernels are used for extrapolation of tail quantiles. The bandwidth of the estimator is chosen using an automatic "plug‐in" method. Confidence intervals for the estimated quantile are obtained by bootstrapping. The estimator is compared with selected tail probability estimators and is shown to be competitive with them.
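To make the idea concrete, the following is a minimal sketch of a generic kernel quantile estimator with a bootstrap confidence interval. It is not the KQ estimator of the abstract: it assumes a Gaussian kernel with a user-supplied fixed bandwidth `h` and simply renormalizes the weights near the ends of [0, 1], whereas the abstract describes boundary kernels and a plug-in bandwidth. The function names `kernel_quantile` and `bootstrap_ci` are hypothetical.

```python
import math
import numpy as np


def kernel_quantile(x, p, h):
    """Kernel-smoothed quantile at probability p (Gaussian kernel, bandwidth h).

    Assumption: weights are renormalized near 0 and 1 instead of using
    boundary kernels, and h is supplied by the caller rather than plug-in.
    """
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    edges = np.arange(n + 1) / n  # 0, 1/n, ..., 1
    # Gaussian CDF of the kernel centred at p, evaluated at each edge.
    cdf = np.array(
        [0.5 * (1.0 + math.erf((e - p) / (h * math.sqrt(2.0)))) for e in edges]
    )
    w = np.diff(cdf)  # kernel mass falling on each order statistic
    w /= w.sum()      # restore unit mass lost outside [0, 1]
    return float(np.dot(w, xs))


def bootstrap_ci(x, p, h, n_boot=500, level=0.90, seed=0):
    """Percentile bootstrap interval for the kernel quantile estimate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    est = [
        kernel_quantile(rng.choice(x, size=x.size, replace=True), p, h)
        for _ in range(n_boot)
    ]
    lo, hi = np.quantile(est, [(1.0 - level) / 2.0, (1.0 + level) / 2.0])
    return float(lo), float(hi)
```

Each order statistic receives the kernel mass that falls on its interval ((i-1)/n, i/n], so the estimate is a smooth weighted average of the sorted data. A resampling bootstrap of this kind cannot produce values beyond the observed range, which is one reason tail extrapolation needs the special treatment the abstract mentions.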
Abstract. We present a nonparametric approach based on local polynomial regression for ensemble forecasting of time series. The state space is first reconstructed by embedding the univariate time series of the response variable in a space of dimension D with a delay time τ. To obtain a forecast from a given time point t, three steps are involved: (i) the current state of the system is mapped onto the state space, giving the feature vector; (ii) a small number K = αn of neighbors of the feature vector (and their future evolution) are identified in the state space, where α ∈ (0, 1] is the fraction of the data used and n is the data length; and (iii) a polynomial of order p is fitted to the identified neighbors and then used for prediction.
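The three steps above can be sketched as follows for a single parameter combination. This is an illustrative sketch only: it assumes a one-step-ahead forecast, Euclidean distances, unweighted least squares, and a polynomial of order p = 1. The abstract excerpt does not say how the ensemble members are generated, so that part is omitted. The names `embed` and `local_linear_forecast` are hypothetical.

```python
import numpy as np


def embed(series, D, tau):
    """Delay embedding: row for time t is [x_t, x_{t-tau}, ..., x_{t-(D-1)tau}]."""
    x = np.asarray(series, dtype=float)
    start = (D - 1) * tau
    idx = np.arange(start, x.size)  # times that have a full delay vector
    states = np.column_stack([x[idx - j * tau] for j in range(D)])
    return states, idx


def local_linear_forecast(series, D=2, tau=1, alpha=0.2):
    """One-step forecast from a local linear fit to the K nearest neighbors."""
    x = np.asarray(series, dtype=float)
    states, idx = embed(x, D, tau)

    # (i) feature vector: the current (last) state of the system
    feature = states[-1]

    # Only earlier states have a known successor value.
    past, past_idx = states[:-1], idx[:-1]
    targets = x[past_idx + 1]

    # (ii) K = alpha * n nearest neighbors (at least D + 1 so the fit is determined)
    K = max(D + 1, int(np.ceil(alpha * past.shape[0])))
    K = min(K, past.shape[0])
    dist = np.linalg.norm(past - feature, axis=1)
    nn = np.argsort(dist)[:K]

    # (iii) order-1 polynomial fitted to the neighbors by least squares
    A = np.column_stack([np.ones(K), past[nn]])
    coef, *_ = np.linalg.lstsq(A, targets[nn], rcond=None)
    return float(coef[0] + feature @ coef[1:])
```

For example, `local_linear_forecast(np.sin(0.3 * np.arange(200)))` returns the predicted next value of a sampled sine wave. A sine satisfies an exact linear recursion in two delay coordinates, so the local linear fit reproduces it closely.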
Relationships between hydrologic variables are often nonlinear. Usually, the functional form of such a relationship is not known a priori. A multivariate, nonparametric regression methodology is provided here for approximating the underlying regression function using locally weighted polynomials. Locally weighted polynomials consider the approximation of the target function through a Taylor series expansion of the function in the neighborhood of the point of estimate. Cross‐validatory procedures for the selection of the size of the neighborhood over which this approximation should take place and for the order of the local polynomial to use are provided and shown for some simple situations. The utility of this nonparametric regression approach is demonstrated through an application to nonparametric short‐term forecasts of the biweekly Great Salt Lake volume. Blind forecasts up to 1 year in the future using the 1847–2004 time series of the Great Salt Lake are presented.
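As a rough illustration of locally weighted polynomials with cross-validated choice of neighborhood size and polynomial order, consider the sketch below. It makes several assumptions that are not from the abstract: a single predictor rather than the multivariate case, tricube weights on the k nearest neighbors, and plain leave-one-out cross-validation as the selection criterion. All function names are hypothetical.

```python
import numpy as np


def local_poly_predict(x, y, x0, k, order=1):
    """Locally weighted polynomial fit at x0 using the k nearest points."""
    d = np.abs(x - x0)
    nn = np.argsort(d)[:k]
    h = d[nn].max()
    h = h if h > 0 else 1.0
    # Tricube weights; the farthest neighbor keeps a tiny positive weight.
    w = np.maximum((1.0 - (d[nn] / h) ** 3) ** 3, 1e-8)
    # Polynomial in (x - x0): the intercept is the Taylor-series value at x0.
    A = np.vander(x[nn] - x0, N=order + 1, increasing=True)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y[nn] * sw, rcond=None)
    return float(coef[0])


def loocv_score(x, y, k, order):
    """Mean squared leave-one-out prediction error for a given (k, order)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.arange(x.size)
    errs = [
        y[i] - local_poly_predict(x[idx != i], y[idx != i], x[i], k, order)
        for i in idx
    ]
    return float(np.mean(np.square(errs)))


def select_k_and_order(x, y, ks, orders=(1, 2)):
    """Grid search over neighborhood size k and polynomial order p."""
    scores = {
        (k, p): loocv_score(x, y, k, p)
        for k in ks
        for p in orders
        if k > p + 1  # need more points than coefficients
    }
    return min(scores, key=scores.get), scores
```

Calling `select_k_and_order(x, y, ks=[5, 10, 20])` returns the `(k, order)` pair with the smallest leave-one-out error together with the full score table, which makes the trade-off between a small, flexible neighborhood and a large, smooth one easy to inspect.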
Kernel density estimation methods have recently been introduced as viable and flexible alternatives to parametric methods for flood frequency estimation. Key properties of such estimators are reviewed in this paper. Attention is focused on the selection of the kernel function and the bandwidth, which are the parameters of the method. Existing techniques for kernel and bandwidth selection are applied to three situations: Gaussian data, skewed data (three-parameter gamma), and mixture data. The intent was to investigate issues relevant to parameter estimation as well as the likely performance of these methods with the small sample sizes typical in hydrology. Bandwidths chosen by minimizing a performance criterion related to the distribution function lead to much smaller mean square errors of tail probabilities than those chosen by cross-validation methods designed for density estimation. However, this can lead to estimates that degenerate to the empirical distribution function, and hence to an unusable flood frequency curve. Variable bandwidths with heavy-tailed kernels appear to do best. Kernel estimators become increasingly competitive in terms of mean square error of estimate as the underlying distribution gets more complex.
INTRODUCTION
Estimating exceedance frequencies of annual maximum flood events at a gaged site is a classical problem of hydrology. A finite data set (usually 20-100 points) is used to extrapolate flood magnitudes corresponding to recurrence intervals of up to 1000 years. A "curve-fitting" or parametric approach is traditionally used for the purpose. An a priori choice of a probability distribution function (e.g., log Pearson type III) is made, and its parameters are estimated using one of several methods (e.g., moments, entropy, or likelihood maximization). Despite intensive research and legislation, no particular curve has emerged as a clear "winner" across different sites.
Indeed, for the typical sample sizes given above, methods (e.g., the Kolmogorov-Smirnov test or the chi-square test) for selecting between probability distributions at a site cannot discriminate among candidate families of distributions and among members of a family [e.g., Kite, 1977]. A comparison between a variable kernel estimate (VK-C-AC) of the cumulative distribution function (CDF) for the St. Mary's River data used by Kite [1977] and Kite's results for a variety of parametric distributions is presented in Figure 1. The kernel estimate is close to the empirical CDF, and none of the parametric alternatives appear reasonable. The empirical CDF and the kernel estimator suggest a bimodal probability density for the data. A hydrologist forced to choose between the parametric alternatives would find from Kite that he cannot discriminate among two-parameter lognormal, three-parameter lognormal, type I extremal, Pearson type III, and log Pearson type III on the basis of standard tests. However, none of these distributions provides a fit consistent with the empirical distribution function. There are often a number of cau...