2020 International Conference on Field-Programmable Technology (ICFPT) 2020
DOI: 10.1109/icfpt51103.2020.00048

Battling the CPU Bottleneck in Apache Parquet to Arrow Conversion Using FPGA

Abstract: In the domain of big data analytics, the bottleneck of converting storage-focused file formats to in-memory data structures has shifted from the bandwidth of storage to the performance of decoding and decompression software. Two widely used formats for big data storage and in-memory data are Apache Parquet and Apache Arrow, respectively. In order to improve the speed at which data can be loaded from disk to memory, we propose an FPGA accelerator design that converts Parquet files to Arrow in-memory data struct…

Cited by: 7 publications (1 citation statement)
References: 11 publications
“…OpenCAPI is also used to accelerate JSON parsing for big data applications [24]. Peltenburg et al [33] propose an FPGA accelerator with OpenCAPI to improve the speed at which data can be loaded from disk to memory. Hoozemans et al [25] explore the benefits of OpenCAPI for FPGA-accelerated big data systems.…”
Section: Related Work
Citation type: mentioning
confidence: 99%