2021
DOI: 10.1088/1742-6596/1727/1/012005

Comparative Characteristics of Big Data Storage Formats

Abstract: One of the most important tasks of any big data processing platform is storing the data it receives. Different systems place different requirements on big data storage formats, which raises the problem of choosing the optimal storage format for the task at hand. This paper describes the five most popular formats for storing big data and presents an experimental evaluation of these formats and a methodology for choosing a format.

Cited by 5 publications (3 citation statements)
References 6 publications
“…Delta Lake is a storage layer for improving the reliability of data lakes [36][37][38]. Delta Lake can operate on the basis of implemented data lakes using Apache Hadoop [11], Amazon S3 [32] or Azure Data Lake Storage [39].…”
Section: Delta Lake (mentioning)
confidence: 99%
“…This file system is cheaper for use than commercial data bases. Using such data warehouse, choosing the right file format is critical [11]. File format determines how information would be stored in HDFS.…”
Section: Introduction (mentioning)
confidence: 99%
“…Current storage media is struggling to keep pace with this growth rate and lacks sufficient storage capacity, presenting significant challenges to current data centers and storage techniques. To address this issue, there has been a surge in research and development of Big Data platforms and systems, resulting in the emergence of various software products, tools, and database systems to meet the escalating demand for large-scale data storage and processing [1]- [4].…”
Section: Introduction (mentioning)
confidence: 99%