When multiplicative noise is used to protect the values of a sensitive attribute in a microdata set, it is frequently assumed that data intruders use the noise-multiplied value to estimate the corresponding unobservable original value of a target record. In this paper, we show that data intruders could easily construct another estimate, instead of using the noise-multiplied value, to attack an original value. The new estimate, called the "correlation-attack" estimate, is obtained by exploiting the potentially high correlation between the noise-multiplied data and the original data. We provide a detailed comparison between the two estimates (the noise-multiplied value and the correlation-attack estimate) by comparing the mean squared errors of the two underlying estimators, and we propose that data providers should always assess the disclosure risks from both estimators when generating noise-multiplied data. Correspondingly, we propose a disclosure risk measure that data providers could use to select the noise-generating variable during the data masking stage. A simulation study illustrates how the disclosure risk measure could help with noise-generating variable selection for masking a set of original data.
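The comparison of mean squared errors described above can be illustrated with a small simulation. The abstract does not state the form of the correlation-attack estimate, so the sketch below substitutes a generic best linear predictor of the original value given the masked value; this is an assumption, not necessarily the estimator studied in the paper. It also assumes the intruder knows the first two moments of the noise distribution. The lognormal original data, the uniform noise on (0.5, 1.5), and the function name `simulate` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=100_000, seed=0):
    """Compare two intruder estimates of Y from masked values Y* = Y * C.

    All distributions here are illustrative assumptions, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical original sensitive values (unknown to the intruder).
    y = rng.lognormal(mean=0.0, sigma=0.5, size=n)

    # Hypothetical multiplicative noise C ~ Uniform(a, b), independent of Y.
    a, b = 0.5, 1.5
    c = rng.uniform(a, b, size=n)
    y_star = y * c  # released noise-multiplied data

    # Noise moments, assumed known to the intruder.
    mean_c = (a + b) / 2
    second_c = (a * a + a * b + b * b) / 3

    # Moments of Y recovered from the masked data alone, using
    # E[Y*] = E[Y] E[C] and E[Y*^2] = E[Y^2] E[C^2].
    mean_y = y_star.mean() / mean_c
    second_y = np.mean(y_star ** 2) / second_c
    var_y = second_y - mean_y ** 2

    # Linear predictor: slope = Cov(Y, Y*) / Var(Y*), where
    # Cov(Y, Y*) = E[C] Var(Y) under independence of Y and C.
    slope = mean_c * var_y / y_star.var()
    y_hat = mean_y + slope * (y_star - y_star.mean())

    # Empirical mean squared errors (computable only in a simulation,
    # because they need the true y).
    mse_naive = np.mean((y_star - y) ** 2)
    mse_linear = np.mean((y_hat - y) ** 2)
    return mse_naive, mse_linear


mse_naive, mse_linear = simulate()
print(f"MSE, noise-multiplied value: {mse_naive:.4f}")
print(f"MSE, linear predictor:       {mse_linear:.4f}")
```

Under these assumed distributions the linear predictor attains the smaller mean squared error, which is the kind of gap a data provider would want to check before settling on a noise distribution. Other choices of noise may narrow or widen the gap.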