Deep learning has achieved top performance in many tasks. Because training a deep learning model is very costly, neural network models should be treated as valuable intellectual property. One concern in this setting is that a malicious user might redistribute a model, or offer a prediction service built on it, without permission. One promising solution is digital watermarking, which embeds a mechanism into the model so that its owner can verify ownership externally. In this study, we present a novel attack method against watermarks, query modification, and demonstrate that every existing watermarking method is vulnerable to either query modification or the existing attack method (model modification). To overcome this vulnerability, we present a novel watermarking method, exponential weighting. We experimentally show that our watermarking method achieves high watermark verification performance even under attacks by unauthorized service providers, such as model modification and query modification, without sacrificing the predictive performance of the neural network model.