Open data plays a fundamental role in the 21st century by stimulating economic growth and by enabling more transparent and inclusive societies. However, it is often difficult to create new high-quality datasets with the privacy guarantees that many use cases require. This paper aims to create a framework for releasing new open data while protecting the individuality of users through a strict definition of privacy called differential privacy. Unlike previous work, this paper provides a framework for privacy-preserving data publishing that can be easily adapted to different use cases, from the generation of time series to continuous data and discrete data; no previous work has focused on the latter class. Indeed, many use cases expose discrete data, or at least a combination of categorical and numerical values. Thanks to the latest developments in deep learning and generative models, it is now possible to model semantically rich data while maintaining both the original distribution of the features and the correlations between them. The output of this framework is a deep network, namely a generator, able to create new data on demand. We demonstrate the efficiency of our approach on real datasets from the French public administration and on classic benchmark datasets.
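For reference, the privacy notion named above is usually stated as follows. The abstract does not say which variant the paper adopts, so the $(\varepsilon, \delta)$ form below is given only as the standard textbook definition; the symbols $\mathcal{M}$, $D$, $D'$, $S$, $\varepsilon$, and $\delta$ are conventional notation and do not come from the abstract.

```latex
% Standard definition of (epsilon, delta)-differential privacy.
% A randomized mechanism M satisfies it if, for every pair of
% neighboring datasets D and D' (differing in one individual's record)
% and every measurable set S of possible outputs:
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta .
\]
% Setting delta = 0 recovers pure epsilon-differential privacy.
```

Smaller values of $\varepsilon$ and $\delta$ mean that the presence or absence of any single record changes the output distribution less, which is the sense in which a released generator would protect individual users.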
The Cloud Adoption Risk Assessment Model is designed to help cloud customers assess the risks they face when selecting a specific cloud service provider. It evaluates background information obtained from cloud customers and cloud service providers to analyze various risk scenarios. This facilitates decision making in selecting the cloud service provider with the most preferable risk profile based on aggregated risks to security, privacy, and service delivery. Based on this model, we developed a prototype that uses machine learning to automatically analyze the risks of representative cloud service providers from the Cloud Security Alliance Security, Trust & Assurance Registry.
The geolocation of data stored and processed in the cloud is an important issue for many organisations because of obligations that require sensitive data to reside or be processed in particular countries. In this paper we introduce an approach, named VLOC, to verify the physical location of the virtual machine on which the customer's applications and data are stored. VLOC is implemented as software that estimates its own geolocation and notifies the corresponding user if the location is unauthorised. VLOC uses a number of arbitrary web servers as external landmarks for localisation and employs network latency measurements for distance estimation. Because network latency fluctuates, VLOC employs a machine learning technique to adapt itself to different network latency tolerances. Unlike most geolocation estimation approaches, VLOC is installed inside the target host (inside the cloud). VLOC requires neither special hardware nor a network of trusted landmarks. The experimental results show that the accuracy of VLOC is higher than that of other existing approaches.
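The general idea of latency-based localisation against landmarks can be illustrated with a minimal sketch. This is not the VLOC implementation: the abstract does not specify the learning technique or the estimation procedure, so the sketch assumes an ordinary least-squares fit from round-trip time to distance as a stand-in for the learned model, and a search over a hypothetical list of candidate locations. All function names, coordinates (approximate), and latency values (synthetic) are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def fit_latency_model(rtts_ms, distances_km):
    """Least-squares fit of distance = slope * rtt + intercept.

    Assumption: a simple linear model stands in for whatever learning
    technique the real system uses to cope with latency fluctuation.
    """
    n = len(rtts_ms)
    mean_x = sum(rtts_ms) / n
    mean_y = sum(distances_km) / n
    sxx = sum((x - mean_x) ** 2 for x in rtts_ms)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(rtts_ms, distances_km))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_location(landmarks, rtts_ms, model, candidates):
    """Pick the candidate whose distances to the landmarks best match
    the distances predicted from the measured round-trip times."""
    slope, intercept = model
    predicted = [max(0.0, slope * r + intercept) for r in rtts_ms]

    def squared_error(candidate):
        return sum((haversine_km(candidate, lm) - d) ** 2
                   for lm, d in zip(landmarks, predicted))

    return min(candidates, key=squared_error)


# Synthetic calibration data (assumed): 100 km per ms of RTT after a 2 ms offset.
model = fit_latency_model([4.0, 7.0, 12.0, 22.0], [200.0, 500.0, 1000.0, 2000.0])

# Approximate coordinates, used only for illustration.
paris, london = (48.8566, 2.3522), (51.5074, -0.1278)
frankfurt, madrid = (50.1109, 8.6821), (40.4168, -3.7038)
amsterdam = (52.3676, 4.9041)

landmarks = [paris, london, frankfurt, madrid]
# Simulate RTTs measured from a host that is actually in Amsterdam.
rtts = [haversine_km(amsterdam, lm) / 100.0 + 2.0 for lm in landmarks]

best = estimate_location(landmarks, rtts, model,
                         candidates=[paris, london, frankfurt, madrid, amsterdam])
print(best)
```

In a deployment of this kind, the candidate set would presumably be replaced by a finer search over coordinates, and an alert would be raised when the estimate falls outside the authorised region; both steps are left out here because the abstract gives no detail on them.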