The amount of data generated in today’s world has a fair share of personal information about individuals that helps data owners and data processors in providing them with personalized services. Different legal and regulatory obligations apply to all data owners collecting personal information, specifying they use it only for the agreed-upon purposes and in a transparent way to preserve privacy. However, it is difficult to achieve this in large-scale and distributed infrastructures as data is continuously changing its form, such as through aggregation with other sources or the generation of new transformed resources, resulting often in the loss or misinterpretation of the collection purpose. In order to preserve the authorized collection purposes, we propose data is added as a part of immutable and append-only resource metadata (provenance), to be retrieved by an access control mechanism when required for data-usage verification. This not only ensures purpose limitation in large-scale infrastructures but also provides transparency for individuals and auditing authorities to track how personal information is used.