Ranked set sampling (RSS) utilizes inexpensive auxiliary information about the ranking of the units in a sample to provide a more precise estimator of the population mean of the variable of interest Y, which is either difficult or expensive to measure. However, in most situations the ranking may not be perfect. In this paper, we assume that the ranking is done on the basis of a concomitant variable X. Regression-type RSS estimators of the population mean of Y are proposed that use this concomitant variable X in both the ranking of the units and the estimation process when the population mean of X is known. When X has unknown mean, double sampling is used to obtain an estimate of the population mean of X. It is found that when X and Y jointly follow a bivariate normal distribution, our proposed RSS regression estimator is more efficient than the naive RSS and simple random sampling (SRS) estimators unless the correlation between X and Y is low (|ρ| < 0.4). Moreover, it is superior to the regression estimator under SRS for all ρ. When normality does not hold, this approach can still perform reasonably well as long as the distribution of the concomitant variable X departs only slightly from symmetry. For heavily skewed distributions, a remedial measure is suggested. An example of estimating the mean plutonium concentration in surface soil on the Nevada Test Site, Nevada, U.S.A., is considered.
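The idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard bivariate normal model with correlation `rho`, ranks each set of `k` units by X alone, and uses the textbook regression-type form (mean of Y plus slope times the gap between the known mean of X and the sample mean of X), with the slope estimated by least squares from the ranked set sample. The function names `draw_rss`, `rss_naive_mean`, and `rss_regression_mean` are hypothetical.

```python
import numpy as np

def draw_rss(rng, k, cycles, rho):
    """Draw a ranked set sample of size k * cycles.

    Assumption: (X, Y) is standard bivariate normal with correlation rho.
    For each rank r, a fresh set of k units is drawn, ranked by the cheap
    variable X only, and the unit with the r-th smallest X is measured.
    """
    cov = [[1.0, rho], [rho, 1.0]]
    xs, ys = [], []
    for _ in range(cycles):
        for r in range(k):
            xy = rng.multivariate_normal([0.0, 0.0], cov, size=k)
            order = np.argsort(xy[:, 0])   # ranking uses X, never Y
            x_r, y_r = xy[order[r]]
            xs.append(x_r)
            ys.append(y_r)
    return np.array(xs), np.array(ys)

def rss_naive_mean(y):
    """Naive estimator: the plain average of the measured Y values."""
    return float(np.mean(y))

def rss_regression_mean(x, y, mu_x):
    """Regression-type estimator, assuming the mean of X (mu_x) is known."""
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return float(np.mean(y) + slope * (mu_x - np.mean(x)))

# Example run: both estimators target the true mean of Y, which is 0 here.
rng = np.random.default_rng(0)
x, y = draw_rss(rng, k=3, cycles=10, rho=0.8)
print(rss_naive_mean(y), rss_regression_mean(x, y, mu_x=0.0))
```

Repeating the draw many times and comparing the empirical variances of the two estimators for different values of `rho` is one way to explore the efficiency comparison described in the abstract. When the mean of X is unknown, `mu_x` would have to be replaced by an estimate from a larger first-phase sample, as in the double sampling scheme the abstract mentions.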
SUMMARY
The method of ranked set sampling is widely applicable in environmental research, mainly in the estimation of the mean and distribution function of the variable of interest, Y. Ranking of the Ys by visual judgment may sometimes be imperfect. When the Ys are expensive to measure, it is more convenient to determine the 'rankings' of the Ys by a concomitant variable, X, which is relatively easy and cheap to measure. In the estimation methods available in the literature, the information carried in X is used only to determine the rankings of the Ys, unless extra distributional or linearity assumptions are made. However, these assumptions may be too stringent in environmental research. Nonparametric estimators for the distribution function and the mean of Y that utilize the concomitant variable and auxiliary information in a ranked set sampling setup are proposed in this article. The estimators are robust to model misspecification, and simulation studies show that their performance is highly satisfactory. The estimators are applied to a real data set to estimate the mean and distribution function of plutonium concentration in surface soil on the Nevada Test Site, Nevada, U.S.A.