Wieger Coutinho scite author profile

Wieger Coutinho

3 Publications

16 Citation Statements Received

97 Citation Statements Given


Publications


Calibrated Hot-Deck Donor Imputation Subject to Edit Restrictions

Coutinho, Waal, Shlomo

2013


A major challenge faced by basically all institutes that collect statistical data on persons, households or enterprises is that data may be missing in the observed data sets. The most common solution for handling missing data is imputation. Imputation is complicated owing to the existence of constraints in the form of edit restrictions that have to be satisfied by the data. Examples of such edit restrictions are that someone who is less than 16 years old cannot be married in the Netherlands, and that someone whose marital status is unmarried cannot be the spouse of the head of household. Records that do not satisfy these edits are inconsistent, and are hence considered incorrect. A further complication when imputing categorical data is that the frequencies of certain categories are sometimes known from other sources or have previously been estimated. In this article we develop imputation methods for imputing missing values in categorical data that take both the edit restrictions and known frequencies into account.
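The two example edits in this abstract can be written as simple predicates over a record. The following is only a minimal sketch of that idea, not the article's method: the field names (`age`, `marital_status`, `relation_to_head`), the category labels, and both helper functions are assumptions made for illustration, and the calibration to known frequencies is not modelled.

```python
# Illustrative edit rules based on the two examples in the abstract.
# Field names and category labels are assumptions made for this sketch.
EDITS = {
    "under_16_cannot_be_married": lambda r: not (
        r["age"] < 16 and r["marital_status"] == "married"
    ),
    "unmarried_cannot_be_spouse": lambda r: not (
        r["marital_status"] == "unmarried" and r["relation_to_head"] == "spouse"
    ),
}


def violated_edits(record):
    """Return the names of all edits that the record fails."""
    return [name for name, ok in EDITS.items() if not ok(record)]


def consistent_donor_values(record, field, candidates):
    """Keep only the candidate values for `field` that leave the record consistent."""
    return [v for v in candidates if not violated_edits({**record, field: v})]
```

For a 12-year-old with a missing marital status, `consistent_donor_values` would discard a donor value of `"married"` and keep `"unmarried"`, which is the kind of filtering an edit-respecting imputation has to perform before a donor value is accepted.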


Automatic Editing for Business Surveys: An Assessment of Selected Algorithms

Waal, Coutinho

2007


Statistical offices are responsible for publishing accurate statistical information about many different aspects of society. This task is complicated considerably by the fact that data collected by statistical offices generally contain errors. These errors have to be corrected before reliable statistical information can be published. This correction process is referred to as statistical data editing. Traditionally, data editing was mainly an interactive activity with the aim to correct all data in every detail. For that reason the data editing process was both expensive and time-consuming. To improve the efficiency of the editing process it can be partly automated. One often divides the statistical data editing process into the error localisation step and the imputation step. In this article we restrict ourselves to discussing the former step, and provide an assessment, based on personal experience, of several selected algorithms for automatically solving the error localisation problem for numerical (continuous) data. Our article can be seen as an extension of the overview article by Liepins, Garfinkel & Kunnathur (1982). All algorithms we discuss are based on the (generalised) Fellegi-Holt paradigm that says that the data of a record should be made to satisfy all edits by changing the fewest possible (weighted) number of fields. The error localisation problem may have several optimal solutions for a record. In contrast to what is common in the literature, most of the algorithms we describe aim to find all optimal solutions rather than just one. As numerical data mostly occur in business surveys, the described algorithms are mainly suitable for business surveys and less so for social surveys. For four algorithms we compare the computing times on six realistic data sets as well as their complexity.
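The Fellegi-Holt paradigm described in this abstract can be illustrated with a brute-force search that returns every minimum-weight set of fields whose values could be changed to satisfy all edits. This is a toy sketch under strong assumptions, not one of the algorithms assessed in the article: each field is restricted to a small finite grid of values (`domains`) so that feasibility can be checked by enumeration, whereas the article concerns continuous data. All function and parameter names are hypothetical.

```python
from itertools import combinations, product


def satisfies(record, edits):
    """True if the record passes every edit (each edit is a predicate)."""
    return all(edit(record) for edit in edits)


def can_repair(record, fields, domains, edits):
    """True if some assignment to `fields`, with all other fields fixed, satisfies all edits."""
    for values in product(*(domains[f] for f in fields)):
        if satisfies({**record, **dict(zip(fields, values))}, edits):
            return True
    return False


def localise_errors(record, domains, edits, weights):
    """Return (minimum weight, all field sets of that weight that can repair the record)."""
    names = list(record)
    best_weight, best_sets = None, []
    for k in range(len(names) + 1):
        for fields in combinations(names, k):
            w = sum(weights[f] for f in fields)
            if best_weight is not None and w > best_weight:
                continue  # cannot be optimal, skip the feasibility check
            if can_repair(record, fields, domains, edits):
                if best_weight is None or w < best_weight:
                    best_weight, best_sets = w, [set(fields)]
                else:
                    best_sets.append(set(fields))
    return best_weight, best_sets
```

With a record such as turnover 100, costs 60, profit 10 and the edit profit = turnover - costs, any single field can be changed to restore consistency, so the search returns three optimal solutions. This mirrors the abstract's point that a record may have several optimal solutions and that most of the described algorithms aim to find all of them.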


Calibrated Hot Deck Imputation for Numerical Data Under Edit Restrictions

Waal, Coutinho, Shlomo

2017


We develop a non-parametric imputation method for item non-response based on the well-known hot-deck approach. The proposed method imputes numerical data in such a way that all record-level edit rules are satisfied and previously estimated or known totals are exactly preserved. We propose a sequential hot-deck imputation approach that takes into account survey weights. Original survey weights are not changed; rather, the imputations themselves are calibrated so that weighted estimates will equal known or estimated population totals. Edit rules are preserved by integrating the sequential hot-deck imputation with Fourier-Motzkin elimination, which defines the range of feasible values that can be used for imputation such that all record-level edits will be satisfied. We apply the proposed imputation method under different scenarios of random and nearest-neighbour hot-deck on two data sets: an annual structural business survey and a synthetically generated data set with a large proportion of missing data. We compare the proposed imputation methods to standard imputation methods based on a set of evaluation measures.
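The role of Fourier-Motzkin elimination mentioned in this abstract can be sketched as follows: the other missing variables are projected out of the linear edits, the observed values are substituted, and what remains bounds the variable to be imputed. This is a minimal illustration under stated assumptions, not the authors' implementation. Edits are assumed to be written as `sum(a[i] * x[i]) <= b`, missing values are marked `None`, and the function names are hypothetical. Survey weights and calibration to totals are left out.

```python
from fractions import Fraction


def eliminate(ineqs, j):
    """Fourier-Motzkin step: project variable j out of rows (a, b) meaning sum(a[i]*x[i]) <= b."""
    pos = [(a, b) for a, b in ineqs if a[j] > 0]
    neg = [(a, b) for a, b in ineqs if a[j] < 0]
    out = [(a, b) for a, b in ineqs if a[j] == 0]
    for ap, bp in pos:
        for an, bn in neg:
            # Scale both rows so the coefficients of x[j] are +1 and -1, then add them.
            cp, cn = 1 / ap[j], 1 / -an[j]
            row = [cp * u + cn * v for u, v in zip(ap, an)]
            out.append((row, cp * bp + cn * bn))
    return out


def feasible_range(ineqs, record, target):
    """Return (lower, upper) bounds for x[target]; None means unbounded on that side.

    Rows that no longer involve the target are ignored in this sketch; a full
    implementation would use them to check that the record is feasible at all.
    """
    # Work in exact rational arithmetic to avoid rounding in the elimination.
    ineqs = [([Fraction(c) for c in a], Fraction(b)) for a, b in ineqs]
    for j, value in enumerate(record):
        if value is None and j != target:
            ineqs = eliminate(ineqs, j)
    lower, upper = None, None
    for a, b in ineqs:
        if a[target] == 0:
            continue
        observed = sum(a[i] * Fraction(v) for i, v in enumerate(record) if v is not None)
        bound = (b - observed) / a[target]
        if a[target] > 0:
            upper = bound if upper is None else min(upper, bound)
        else:
            lower = bound if lower is None else max(lower, bound)
    return lower, upper
```

As a usage example, take variables (turnover, costs, profit) with the edits profit = turnover - costs (written as two inequalities), costs >= 0 and profit >= 0. For a record with turnover 100 and both other values missing, the range for costs comes out as 0 to 100; a hot-deck step would then only accept donor values for costs inside that interval before moving on to profit.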

