Q100

Wu, Lisa; Lottarini, Andrea; Paine, Timothy K.; Kim, Martha A.; Ross, Kenneth A.

doi:10.1145/2644865.2541961

Cited by 8 publications

(2 citation statements)

References 21 publications


“…application-specific integrated circuits (ASICs). Google TPUs [1], AWS Inferentia [2], data processing units [3], Intel quick sync video [4], Xilinx FPGA video encoder [5], along with various models of Nvidia GPU [6] are just a few popular mentions in this drastic shift.…”

Section: Introduction (mentioning)

confidence: 99%

“…application-specific integrated circuits (ASICs). Google TPUs [1], AWS Inferentia [2], data processing units [3], Intel quick sync video [4], Xilinx FPGA video encoder [5], along with various models of Nvidia GPU [6] are just a few popular mentions in this drastic shift. It is anticipated that the datacenters with such ASICs machines will be the cornerstone of next generation cloud computing systems.…”

Section: Introduction (mentioning)

confidence: 99%


SMSE: A serverless platform for multimedia cloud systems

Denninnart, Amini Salehi (2023), Concurrency and Computation


Summary: Along with the rise of domain-specific computing (ASIC hardware) and domain-specific programming languages, we envision that the next step is the emergence of domain-specific cloud platforms. Considering multimedia streaming as one of the most popular applications in the IT industry, the goal of this study is to develop the serverless multimedia streaming engine (SMSE), the first domain-specific serverless platform for multimedia streaming. SMSE democratizes multimedia service development by enabling content providers (or even end users) to rapidly develop their desired functionality on their multimedia content. Upon developing SMSE, the next goal of this study is to address its efficiency challenges and develop a function container provisioning method that efficiently utilizes cloud resources and improves users' quality of service. In particular, we develop a dynamic method that provisions durable or ephemeral containers depending on the spatiotemporal and data-dependency characteristics of the functions. Evaluating the prototype implementation of SMSE under real-world settings demonstrates its capability to reduce both the containerization overhead and the makespan of serving multimedia processing functions (by up to 30%) compared with the function provisioning methods used in general-purpose serverless cloud systems.


Accelerating Hash-Based Query Processing Operations on FPGAs by a Hash Table Caching Technique

Salami, Arcas-Abella, Sönmez, et al. (2017), Communications in Computer and Information Science


Abstract. Extracting valuable information from the rapidly growing field of Big Data faces serious performance constraints, especially in software-based database management systems (DBMS). In a query processing system, hash-based computational primitives such as the hash join and the group-by are the most time-consuming operations, as they frequently need to access the hash table in high-latency off-chip memory and to traverse the whole table. Moreover, hash collisions are an inherent issue of hash tables and can degrade overall performance. To alleviate this problem, in this paper we present a novel, purely hardware-based hash engine implemented on an FPGA. To mitigate the high memory access latencies and to resolve hash collisions faster, we follow a novel design point: caching the hash table entries in the fast on-chip Block RAMs of the FPGA. Faster accesses to the corresponding hash table entries from the cache can lead to improved overall performance. We evaluated the proposed approach by running the hash-based table join and group-by operations of 5 TPC-H benchmark queries. The results show 2.9x to 4.4x speedups over the cache-less FPGA-based baseline.
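
The caching idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the abstract does not describe the cache organization, so a direct-mapped, write-through cache is assumed here, and the class name `CachedHashTable`, its parameters, and its counters are all hypothetical. A chained hash table stands in for off-chip memory and a small fixed-size array stands in for on-chip Block RAM.

```python
class CachedHashTable:
    """Chained hash table (models slow off-chip memory) fronted by a small
    direct-mapped cache (models fast on-chip Block RAM). Illustrative only."""

    def __init__(self, num_buckets=1024, cache_lines=64):
        self.buckets = [[] for _ in range(num_buckets)]
        self.cache = [None] * cache_lines  # each line: (key, value) or None
        self.slow_accesses = 0             # accesses to the backing table
        self.cache_hits = 0                # lookups served from the cache

    def _bucket(self, key):
        return hash(key) % len(self.buckets)

    def _line(self, key):
        return hash(key) % len(self.cache)

    def insert(self, key, value):
        # Write-through (an assumption): update the backing table, then the cache.
        self.slow_accesses += 1
        bucket = self.buckets[self._bucket(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                break
        else:
            bucket.append((key, value))
        self.cache[self._line(key)] = (key, value)

    def lookup(self, key):
        line = self._line(key)
        entry = self.cache[line]
        if entry is not None and entry[0] == key:
            # Hit: no off-chip access and no collision-chain traversal.
            self.cache_hits += 1
            return entry[1]
        # Miss: walk the collision chain in the backing table.
        self.slow_accesses += 1
        for k, v in self.buckets[self._bucket(key)]:
            if k == key:
                self.cache[line] = (k, v)  # fill the cache line for next time
                return v
        return None


if __name__ == "__main__":
    table = CachedHashTable(num_buckets=8, cache_lines=4)
    for key in range(16):
        table.insert(key, key * 10)
    # Repeated probes of a few hot keys, as in a skewed join or group-by.
    for _ in range(100):
        for key in (12, 13, 14, 15):
            table.lookup(key)
    print("cache hits:", table.cache_hits, "slow accesses:", table.slow_accesses)
```

In this model, repeated probes of the same keys are served from the cache and skip the chain walk, which is the effect the abstract attributes to keeping hash table entries in Block RAM. The counters make the trade-off visible but say nothing about real FPGA latencies.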
