This paper examines how to construct imputation classes using the score method, sometimes called predictive mean stratification or response propensity stratification, depending on the context. This method was studied in Thomsen (1973), Little (1986) and Eltinge & Yansaneh (1997). We use a different framework to evaluate the properties of the resulting imputed estimator of a population mean. In our framework, we condition on the realized sample. This considerably simplifies our theoretical developments in the frequent situation where the boundaries and the number of classes are sample-dependent. We find that the key factor for reducing the non-response bias is to form classes that are homogeneous with respect to the response probabilities and/or the conditional expectation of the variable of interest. In the latter case, the non-response/imputation variance is also reduced. Finally, we performed a simulation study to evaluate various versions of the score method and to compare them with a cross-classification method, which is frequently used in practice. The results showed the superiority of the score method in general. Copyright 2007 The Authors. Journal compilation © 2007 International Statistical Institute.
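As a rough illustration of the score method described in this abstract, the sketch below sorts sample units by a score, cuts them into classes, and imputes within each class. The function name, the equal-size class rule and the weighted class-mean imputation are assumptions made for the example; the abstract does not say which versions of the method the paper evaluates, and the scores (estimated response probabilities or predicted values) are taken as given.

```python
def score_method_mean(y, responded, score, weights, n_classes):
    """Imputed estimator of a population mean with classes formed on a score.

    Hypothetical sketch: classes are equal-size groups of units sorted by
    score, and each nonrespondent receives the weighted mean of the
    respondents in its class.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: score[i])
    total = 0.0
    for c in range(n_classes):
        # Class boundaries depend on the realized sample through the sort.
        members = order[c * n // n_classes:(c + 1) * n // n_classes]
        resp = [i for i in members if responded[i]]
        if not resp:
            raise ValueError("class without respondents; use fewer classes")
        w_resp = sum(weights[i] for i in resp)
        class_mean = sum(weights[i] * y[i] for i in resp) / w_resp
        # Respondents keep their observed value; nonrespondents are imputed.
        total += sum(
            weights[i] * (y[i] if responded[i] else class_mean)
            for i in members
        )
    return total / sum(weights)


# Toy data: units with low scores respond less often and have larger y.
y = [1.0, 1.0, None, 5.0, None, None]
responded = [True, True, False, True, False, False]
score = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
weights = [1.0] * 6

two_classes = score_method_mean(y, responded, score, weights, 2)
one_class = score_method_mean(y, responded, score, weights, 1)
```

In this toy data set, using a single class pulls the estimate toward the values of the units that respond most often, while two score-based classes impute each nonrespondent from respondents with similar scores.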
We argue that the conditional bias associated with a sample unit can be a useful measure of influence in finite population sampling. We use the conditional bias to derive robust estimators that are obtained by downweighting the most influential sample units. Under the model-based approach to inference, our proposed robust estimator is closely related to the well-known estimator of Chambers (1986). Under the design-based approach, it possesses the desirable feature of being applicable with most sampling designs used in practice. For stratified simple random sampling, it is essentially equivalent to the estimator of Kokic & Bell (1994). The proposed robust estimator depends on a tuning constant. In this paper, we propose a method for determining the tuning constant and show that the resulting estimator is consistent. Results from a simulation study suggest that our approach improves the efficiency of standard nonrobust estimators when the population contains units that may be influential if selected in the sample.
The validity of design-based inference is not dependent on any model assumption. However, it is well known that estimators derived through design-based theory may be inefficient for the estimation of population totals when the design weights are weakly related to the variables of interest and have widely dispersed values. We propose estimators that have the potential to improve the efficiency of any estimator derived under the design-based theory. Our main focus is limited to the improvement of the Horvitz-Thompson estimator, but we also discuss the extension to calibration estimators. The new estimators are obtained by smoothing design or calibration weights using an appropriate model. Our approach to inference requires the modelling of only one variable, the weight, and it leads to a single set of smoothed weights in multipurpose surveys. This is to be contrasted with other model-based approaches, such as the prediction approach, in which it is necessary to postulate and validate a model for each variable of interest leading potentially to variable-specific sets of weights. Our proposed approach is first justified theoretically and then evaluated through a simulation study.
We propose to use calibrated imputation to compensate for missing values. This technique consists of finding final imputed values that are as close as possible to preliminary imputed values and are calibrated to satisfy constraints. Preliminary imputed values, potentially justified by an imputation model, are obtained through deterministic single imputation. Using appropriate constraints, the resulting imputed estimator is asymptotically unbiased for estimation of linear population parameters such as domain totals. A quasi-model-assisted approach is considered in the sense that inferences do not depend on the validity of an imputation model and are made with respect to the sampling design and a non-response model. An imputation model may still be used to generate imputed values and thus to improve the efficiency of the imputed estimator. This approach naturally handles the situation where more than one imputation method is used owing to missing values in the variables that are used to obtain imputed values. We use the Taylor linearization technique to obtain a variance estimator under a general non-response model. For the logistic non-response model, we show that ignoring the effect of estimating the non-response model parameters leads to overestimating the variance of the imputed estimator. In practice, the overestimation is expected to be moderate or even negligible, as shown in a simulation study. Copyright 2005 Royal Statistical Society.
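The idea of moving preliminary imputed values as little as possible while meeting a calibration constraint can be sketched in a few lines. The example assumes an unweighted squared-error distance and a single linear constraint on the weighted total of the imputed values; the function name and the target total are hypothetical, and the abstract does not state which distance function or constraints the paper actually uses.

```python
def calibrate_imputed_values(prelim, weights, target_total):
    """Final imputed values closest to `prelim` in squared error,
    subject to sum(weights[i] * final[i]) == target_total.

    Hypothetical sketch: one linear constraint, least-squares distance.
    The closed form comes from a single Lagrange multiplier.
    """
    gap = target_total - sum(w * v for w, v in zip(weights, prelim))
    scale = gap / sum(w * w for w in weights)
    # Each value is shifted in proportion to its weight.
    return [v + scale * w for w, v in zip(weights, prelim)]


prelim = [3.0, 4.0]    # preliminary imputed values for two nonrespondents
weights = [1.0, 2.0]   # their survey weights
final = calibrate_imputed_values(prelim, weights, target_total=16.0)
calibrated_total = sum(w * v for w, v in zip(weights, final))
```

The preliminary values have a weighted total of 11, so the adjustment spreads the gap of 5 across the two units and the calibrated values reproduce the target total.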
We study the generalized bootstrap technique for general sampling designs. We focus mainly on bootstrap variance estimation, but we also investigate the empirical properties of bootstrap confidence intervals obtained using the percentile method. The generalized bootstrap consists of randomly generating bootstrap weights so that the first two (or more) design moments of the sampling error are approximated by the corresponding moments under the bootstrap mechanism. Most bootstrap methods in the literature can be viewed as special cases of the generalized bootstrap. We discuss issues such as the choice of the distribution used to generate the bootstrap weights, the choice of the number of bootstrap replicates, and the possible presence of negative bootstrap weights. We first describe the generalized bootstrap for the linear Horvitz-Thompson estimator and then consider non-linear estimators such as those defined through estimating equations. We also develop two ways of applying the bootstrap to the generalized regression estimator of a population total. We study in greater depth the case of Poisson sampling, which is often used to select samples in price index surveys conducted by national statistical agencies around the world. For Poisson sampling, we consider a pseudo-population approach and show that the resulting bootstrap weights capture the first three design moments of the sampling error. We use a simulation study and an example with real survey data to illustrate the theory.
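The moment-matching idea behind the generalized bootstrap can be sketched for the Horvitz-Thompson estimator under Poisson sampling. The sketch assumes that each design weight is multiplied by an independent normal adjustment with mean 1 and variance 1 − π_i, so that the bootstrap variance matches the usual unbiased variance estimator for Poisson sampling; this matches only the first two moments and is not the pseudo-population scheme mentioned in the abstract. The function names and the normal distribution are choices made for the example, and normal adjustments can produce negative bootstrap weights.

```python
import random


def generalized_bootstrap_variance(y, pi, n_reps=2000, seed=0):
    """Bootstrap variance of the Horvitz-Thompson total under Poisson sampling.

    Hypothetical sketch: adjustments a_i = 1 + sqrt(1 - pi_i) * z_i with
    z_i standard normal, so that E(a_i) = 1 and Var(a_i) = 1 - pi_i.
    """
    rng = random.Random(seed)
    w = [1.0 / p for p in pi]
    ht = sum(wi * yi for wi, yi in zip(w, y))
    sq_dev = 0.0
    for _ in range(n_reps):
        rep = sum(
            wi * (1.0 + (1.0 - p) ** 0.5 * rng.gauss(0.0, 1.0)) * yi
            for wi, p, yi in zip(w, pi, y)
        )
        sq_dev += (rep - ht) ** 2
    return ht, sq_dev / n_reps


def poisson_ht_variance_estimate(y, pi):
    """Usual unbiased variance estimator of the HT total, Poisson sampling."""
    return sum((1.0 - p) / p ** 2 * yi ** 2 for yi, p in zip(y, pi))


y = [10.0, 20.0, 30.0, 40.0]   # sampled values
pi = [0.2, 0.5, 0.4, 0.8]      # their inclusion probabilities
ht_total, boot_var = generalized_bootstrap_variance(y, pi)
analytic_var = poisson_ht_variance_estimate(y, pi)
```

With enough replicates, `boot_var` should be close to `analytic_var`, which is the sense in which the bootstrap weights reproduce the second design moment of the sampling error.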