2020
DOI: 10.1093/jssam/smaa005

Finding a Flexible Hot-Deck Imputation Method for Multinomial Data

Abstract: Detailed breakdowns on totals are often collected in surveys, such as a breakdown of total product sales by product type. These multinomial data are often sparsely reported with wide variability in proportions across units. In addition, there are often true zeros that differ across units even within industry; for example, one establishment sells jeans but not shoes, and another sells shoes but not socks. It is quite common to have large fractions of missing data for these detailed items, even when totals are r…

Cited by 7 publications (10 citation statements)
References 9 publications
“…Second, if the proportion of recipients to donors is large, then NNRI may repeatedly use the same donor, yielding insufficient variation within each imputation cell. Andridge et al (2021) propose a modification of the NNRI method that addresses this issue in a multiple imputation framework; it would be useful to develop a single imputation analogue. Third, in practice, many auxiliary variables can be used to determine nearest neighbors, in which case, dimension reduction is necessary to mitigate matching discrepancies.…”
Section: Discussion (mentioning)
confidence: 99%
“…The simulation varies in four factors: parametric distribution of the size (auxiliary) variable x_i, the size of the finite population (N), relationship of auxiliary variable and detail items (x_i and y_i), and response propensity. The data generation is largely patterned after the realistic procedures described in Andridge, Bechtel, and Thompson (2021), with each separate process outlined below. The first population scenario ensures the compact and convex support requirements of Assumption 3.…”
Section: Simulation Study (mentioning)
confidence: 99%
“…The variance estimation is further complicated in our setting due to the potential shift in multinomial distributions as described above. Andridge et al (2021) investigates multiple imputations of proportions with nearest neighbour ratio hot deck imputation using the Approximate Bayesian Bootstrap, finding consistent underestimation of the variance. As in the cited reference, we assume that the set of details is reported or is missing; in practice, detail components that do not sum to their associated total and are not within a small raking tolerance are often treated as missing, and all detail items are imputed, regardless of their original reporting status.…”
Section: Introduction (mentioning)
confidence: 99%
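
The citation statements above refer to nearest neighbor ratio hot-deck imputation (NNRI) of detail items when the total is reported. As a rough illustration only, the sketch below shows one way such a single-imputation step could look. It is not the authors' implementation: the function name `nn_ratio_hot_deck`, the use of a single size variable for matching, and the absence of imputation cells are all assumptions made for the example.

```python
import numpy as np


def nn_ratio_hot_deck(size, totals, details, missing):
    """Impute missing detail rows from the nearest donor's proportions.

    Hypothetical helper for illustration; names and matching rule are assumed.

    size    : (n,) auxiliary size variable used to find the nearest donor
    totals  : (n,) reported totals, assumed available for every unit
    details : (n, k) detail items; rows flagged in `missing` are ignored
    missing : (n,) boolean mask, True where the whole detail set is missing
    """
    size = np.asarray(size, dtype=float)
    totals = np.asarray(totals, dtype=float)
    out = np.array(details, dtype=float)  # copy, so the input is not modified
    missing = np.asarray(missing, dtype=bool)

    donors = np.flatnonzero(~missing)
    if donors.size == 0:
        raise ValueError("at least one donor is required")

    # Donor proportions keep each donor's own pattern of true zeros.
    # Assumes every donor's details sum to a positive value.
    donor_props = out[donors] / out[donors].sum(axis=1, keepdims=True)

    for r in np.flatnonzero(missing):
        # Nearest donor by absolute distance on the size variable.
        nearest = np.argmin(np.abs(size[donors] - size[r]))
        # Scale the donor's proportions to the recipient's reported total.
        out[r] = totals[r] * donor_props[nearest]
    return out
```

Because the imputed row is the donor's proportions scaled by the recipient's total, the imputed details sum to the reported total by construction. This deterministic version also reuses the same donor for every recipient closest to it, which is the limitation noted in the Discussion statement quoted above.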