Owing to the growing concerns over data confidentiality, many national statistical agencies are considering remote access servers to disseminate data to the public. With remote servers, users submit requests for output from statistical models fit using the collected data, but they are not allowed access to the data. Remote servers also should enable users to check the fit of their models; however, standard diagnostics like residuals or influence statistics can disclose individual data values. In this article, we present diagnostics for categorical data regressions that can be safely and usefully employed in remote servers. We illustrate the diagnostics with simulation studies.
Statistical agencies that own different databases on overlapping subjects can benefit greatly from combining their data. These benefits are passed on to secondary data analysts when the combined data are disseminated to the public. Sometimes combining data across agencies or sharing these data with the public is not possible: one or both of these actions may break promises of confidentiality that have been given to data subjects. We describe an approach that is based on two stages of multiple imputation that facilitates data sharing and dissemination under restrictions of confidentiality. We present new inferential methods that properly account for the uncertainty that is caused by the two stages of imputation. We illustrate the approach by using artificial and genuine data. Copyright (c) 2009 Royal Statistical Society.