2019
DOI: 10.14778/3342263.3342634

Accelerating raw data analysis with the ACCORDA software and hardware architecture

Abstract: The data science revolution and growing popularity of data lakes make efficient processing of raw data increasingly important. To address this, we propose the ACCelerated Operators for Raw Data Analysis (ACCORDA) architecture. By extending the operator interface (subtype with encoding) and employing a uniform runtime worker model, ACCORDA integrates data transformation acceleration seamlessly, enabling a new class of encoding optimizations and robust high-performance raw data processing. Together, these key fe…

Cited by 15 publications (5 citation statements)
References 52 publications
“…However, their presented PUs can't process one character per cycle and must be replicated extensively to achieve a high throughput, hence requiring a lot of resources. ACCORDA [3] tries to improve the processing of raw unstructured data with dedicated Hardware Accelerators. The authors show that their unstructured data processor can parse and filter JSON data for all common predicates, but is in return again very resource intensive.…”
Section: Related Work
confidence: 99%
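To make the trade-off in the statement above concrete, the sketch below shows the software baseline that a unit like ACCORDA's unstructured data processor is meant to offload: parsing newline-delimited JSON on the CPU and evaluating a predicate per record. The function name, record layout, and predicate are illustrative assumptions, not part of either paper.

```python
import json

def filter_records(lines, predicate):
    """Parse newline-delimited JSON and keep records matching the predicate.

    In software, every byte must be parsed on the CPU before the predicate
    can be evaluated; a hardware parse-and-filter unit moves this work off
    the processing cores.
    """
    out = []
    for line in lines:
        record = json.loads(line)
        if predicate(record):
            out.append(record)
    return out

raw = [
    '{"id": 1, "price": 9.5}',
    '{"id": 2, "price": 25.0}',
    '{"id": 3, "price": 12.0}',
]
cheap = filter_records(raw, lambda r: r["price"] < 20)
print([r["id"] for r in cheap])  # [1, 3]
```

Replicating such a loop across many hardware processing units is what the citing authors flag as resource-intensive when each unit cannot sustain one character per cycle.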
“…Similarly, the ACCORDA [16] proposal for integrating a specialized SQL operator engine into the memory hierarchy, and the Oracle SPARC M7 "Software in Silicon" accelerator, operate on the stream of data between DRAM and processor cache, while ACCI connects a programmable accelerator for arbitrary tasks to the coherent interconnect. Nevertheless, there are similarities which show the value of the functionality all these systems provide.…”
Section: Related Work
confidence: 99%
“…This effectively turns the FPGA into a smart memory management unit, treating FPGA memory as memory in a different NUMA node and returning results directly into the L2 cache of the requesting core, much as a read or write operation over conventional memory would do. This use case offers a nice contrast to existing work implementing similar functionality in restricted settings such as the garbage collection accelerator implemented as part of a RISC-V processor architecture [30] or ACCORDA, a near-memory accelerator prototyped on an FPGA and intended to be inserted on the path between caches and CPUs to offload SQL data processing [16]. For reasons of space, we leave other, more complex use cases that involve manipulating coherency for future work.…”
Section: Introduction
confidence: 99%
“…Although, to our knowledge, not yet used with disaggregated memory, the idea mirrors a growing trend to push SQL operators near the data, until now mostly to storage [43,72]. Even more ambitious are accelerators embedded in the data path between memory and CPU caches [29,35], which can filter data as it is read from memory to reduce data movement and cache pollution. Finally, in the cloud, systems like Amazon's AQUA [21] use SSDs attached to FPGAs to implement a caching layer for RedShift that supports SQL filtering operations and operator push-down to minimize the amount of data movement from storage to the processing nodes.…”
Section: Efficient Data Movement
confidence: 99%
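The data-movement saving that motivates these push-down designs can be sketched with a toy model: filter rows at the memory side and only matches cross the bus, versus shipping every row to the CPU and filtering there. Row size, row count, and selectivity below are illustrative assumptions only.

```python
ROW_SIZE = 64  # assumed bytes per row

def scan_no_pushdown(rows, predicate):
    """Baseline: every row crosses the memory bus, CPU filters."""
    moved = len(rows) * ROW_SIZE
    return [r for r in rows if predicate(r)], moved

def scan_with_pushdown(rows, predicate):
    """Push-down: filtering happens near memory, only matches move."""
    selected = [r for r in rows if predicate(r)]
    moved = len(selected) * ROW_SIZE
    return selected, moved

rows = list(range(1000))
sel = lambda r: r % 10 == 0          # 10% selectivity
_, moved_cpu = scan_no_pushdown(rows, sel)
_, moved_acc = scan_with_pushdown(rows, sel)
print(moved_cpu, moved_acc)  # 64000 6400
```

At 10% selectivity the pushed-down scan moves one tenth of the bytes, which is the cache-pollution and bus-congestion argument the citing papers make for filtering between memory and the CPU.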
“…Such designs do not address the overhead of moving large data sets to the CPU, only to have most of it filtered or projected out. Specialized hardware between memory and the CPU has even been proposed to filter data as early as possible, minimizing bus congestion and cache pollution [12,35].…”
Section: Introduction
confidence: 99%