Data have become an important asset. Mining the value contained in personal data, making personal data an exchangeable commodity, has become a hot spot of industry research. Then, how to price personal data reasonably becomes a problem we have to face. Based on previous research on data provenance, this paper proposes a novel minimum provenance pricing method, which is to price the minimum source tuple set that contributes to the query. Our pricing model first sets prices for source tuples according to their importance and then makes query pricing based on data provenance, which considers both the importance of the data itself and the relationships between the data. We design an exact algorithm that can calculate the exact price of a query in exponential complexity. Furthermore, we design an easy approximate algorithm, which can calculate the approximate price of the query in polynomial time. We instantiated our model with a select-joint query and a complex query and extensively evaluated its performances on two practical datasets. The experimental results show that our pricing model is feasible.
The widespread acceptance of data sharing has promoted the relevant research in both academia and industry. ''Data sharing'' is the process of interchanging data among multiple data sources in a controllable access manner. Any such system provides common functionality of storage and access; however, there are prominent differences in non-functionalities, such as self-control, transparency, costeffectiveness, incentive, and auditing, which make data sharing in a dilemma. In this paper, we propose a distributed data sharing model based on blockchain technology: data owner publishes data with an additional control policy; data consumer inquires data with an access request; and consensus nodes evaluate the request and make decisions. After formalizing the system model, we discuss the procedure of data sharing, reputation management, and proposed solution. With the access control policy, price compensation, and reputation management mechanism, we achieve self-control and price balance in sharing data. Our model is equally a closed-loop control system that acts as a regulator of supervising its participants. The analyses and experimental results show, comparing with the existing sharing systems, that our model is practical and can spur agents to participate actively in sharing data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.