2016
DOI: 10.4028/www.scientific.net/amm.855.153

A Comparison of ORC-Compress Performance with Big Data Workload on Virtualization

Abstract: Big Data is now widely used in many organizations. Hive is an open-source data warehouse system for managing large data sets. It provides a SQL-like interface to Hadoop on top of the MapReduce framework. Big Data solutions have begun to adopt HiveQL to improve the execution time for relational information. In this paper, we investigate query execution time, comparing two compression algorithms for ORC files: ZLIB and SNAPPY. The results show that ZLIB can compress data up to 87% compared to NONE c…

Cited by 5 publications (4 citation statements)
References 5 publications
“…When it comes to multi-dimensional complex associations, the efficiency is very low. To address this issue, a Solr-based Hbase massive data secondary index scheme is proposed in [16]. Besides, a secondary index is established for Hbase [8], such that the query efficiency can be improved.…”
Section: Related Work (mentioning)
confidence: 99%
“…This has saved a lot of manpower and time costs for the enterprise. (b) The original data types are diverse and have different formats, such as CSV, TXT, JSON, Parquet, ORC, and other formats. In addition, data sources are also diverse, including relational MySQL, Kafka, Hbase, and other data sources. DLI is fully compatible with the Spark ecosystem, naturally supporting multiple data sources and formats.…”
Section: Introduction (mentioning)
confidence: 99%
“…[13] Facing the new unconventional gas reservoir type of deep coalbed methane, there is an urgent need to solve numerous basic theories and technical bottlenecks. [14] This is the only way to achieve gas storage and production increase and effective economic development. [15] periphery, empowering cooperation, expanding new energy (sources), and digital management."…”
Section: Introduction (mentioning)
confidence: 99%
“…The fracturing fluid system still cannot meet the requirements of low damage and fouling transformation, and further technology development is needed. Facing the new unconventional gas reservoir type of deep coalbed methane, there is an urgent need to solve numerous basic theories and technical bottlenecks. This is the only way to achieve gas storage and production increase and effective economic development.…”
Section: Introduction (mentioning)
confidence: 99%