Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Syst 2019
DOI: 10.1145/3297858.3304008

Scalable Processing of Contemporary Semi-Structured Data on Commodity Parallel Processors - A Compilation-based Approach

Abstract: JSON (JavaScript Object Notation) and its derivatives are essential in the modern computing infrastructure. However, existing software often fails to process such types of data in a scalable way, mainly for two reasons: (i) the processing often requires to build a memory-consuming parse tree; (ii) there exist inherent dependences in processing the data stream, preventing any data-level parallelization. Facing the challenges, developers often have to construct ad-hoc pre-parsers to split the data stream in orde…

Cited by 8 publications (16 citation statements)
References 28 publications
“…In addition, there are also proposals of programming language constructs [35], speculation design for irregular applications [34], as well as compilers with speculative execution supports [38]. Some recent works also target speculative parallelization of semi-structured data stream processing, but at the byte level rather than bit level, like speculative HTML parsing [51] and speculative path query processing of XML/JSON streams [19,20].…”
Section: Related Work (mentioning)
confidence: 99%
“…XML streaming [10], [14], [17] is an instance on XML processing. In particular, it has recently been actively investigated for JSON [9], [15], [18], [19], [25]. Although these studies dealt with JSON parsing, their problem settings differed depending on concerned data models and queries.…”
Section: Related Work 61 Json and Xml Processing (mentioning)
confidence: 99%
“…Grammar-based enhancement of parallel processing [9], [10] aimed to compile queries with grammatical constraints. Our aim with Centaurus is grammar-based programming.…”
Section: Related Work 61 Json and Xml Processing (mentioning)
confidence: 99%
“…To demonstrate its efficiency, we compared JSONSki with several existing JSON tools using real-world datasets and standard path queries. The results have shown that JSONSki outperforms JPStream [35], a state-of-the-art streaming library, and simdjson [40], a popular SIMD-based parser substantially, in both the large and small record processing scenarios. • First, it systematically reveals the fast-forward opportunities in the semi-structured data streaming (Section 3).…”
Section: Introduction (mentioning)
confidence: 97%
“…• Streaming scheme can naturally avoid the above issues by immediately consuming the semi-structured data stream, locating and extracting the data of interests on the fly. For example, JsonSurfer [13] and JPStream [35] evaluate path queries dynamically as they traverse the data stream without generating any in-memory trees. Thus, they only need to scan the data in one pass with a small memory footprint.…”
Section: Introduction (mentioning)
confidence: 99%