2015 IEEE 39th Annual Computer Software and Applications Conference 2015
DOI: 10.1109/compsac.2015.345

Provenance Research Issues and Challenges in the Big Data Era

Abstract: Provenance of Big Data is a hot topic in the database and data mining research communities. Basically, provenance is the process of detecting the lineage and the derivation of data and data objects, and it plays a major role in database management systems as well as in workflow management systems and distributed systems. Despite this, research on the provenance of big data is still in its embryonic phase, and much work remains to be done in this area. Inspired by these considerations, in this paper we provide …

citations
Cited by 10 publications
(8 citation statements)
references
References 24 publications
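“…In summary, although current research on big data provenance has achieved some results, many problems still require further study, specifically [290,316,317]: (1) the heterogeneity of big data makes unified modeling and structured description of provenance data more difficult, and the exchange and fusion of provenance data produced by different provenance systems faces compatibility problems across multiple provenance formats; (2) the sheer volume of big data means that collecting, transmitting, and storing provenance data adds to the computation, communication, and storage burden on big data platforms; some researchers have proposed provenance compression to reduce storage and transmission overhead, but compressing and decompressing the data also introduces new computational overhead [288]; (3) the encapsulation and transparency of big data platforms make provenance collection harder to implement, and research on collecting and storing provenance for streaming data in particular is still in its infancy, with few recent results; (4) the security of provenance data has attracted researchers' attention, but the practicality of the proposed methods in big data scenarios needs further validation and improvement; (5) current research on provenance-based data security supervision mostly only analyzes the feasibility of using provenance data for data supervision and illustrates it with case studies, but lacks automated methods for detecting security threats such as data leakage based on provenance.…”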
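Section: Among the collection, storage, fusion, and querying of provenance data, provenance collection is researchers' main focus; in big data scenarios, provenance data is itself a kind of big data
Citation type: unclassified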
“…Data provenance is a well-known research area within database and data mining. It considers the problem of identifying the origin, the creation, as well as the propagation processes of data [4]. It may be defined as the process of detecting the lineage and the derivation of data and data objects [5].…”
Section: Background and Related Work On Big-data Provenance
Citation type: mentioning
confidence: 99%
“…Data provenance is an essential component in a various areas such as: database management systems, workflow management systems, distributed systems, and debugging ICT systems [4]. In the case of security violations, a system administrator should be able to identify the origination of the error, in addition to its causes and impacts [6].…”
Section: Background and Related Work On Big-data Provenance
Citation type: mentioning
confidence: 99%
“…Provenance management fundamentals for electronic data can be applied to gain better control over data quality [42]. However, provenance management for big data poses several challenges [44], which will also have to be mitigated.…”
Section: Metadata
Citation type: mentioning
confidence: 99%