The key to a data parallel compiler

Hsu, Aaron W.

doi:10.1145/2935323.2935331

Cited by 4 publications

(1 citation statement)

References 18 publications


“…Another approach, specific to our application, is refactoring the data structures. Here, the binary decision tree could be transformed into an indexed path matrix [24]. Such a parallel data structure would allow the GPU to also traverse the tree in parallel, avoiding the sequential while loop within the random forest tree traversal that currently limits GPUs performance in our use case.…”

Section: Possible Optimizations and Discussion (mentioning)

confidence: 99%

Unleashing GPUs for Network Function Virtualization: an open architecture based on Vulkan and Kubernetes

Haavisto, Cholez, Riekki

2022

NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium


General-purpose computing on graphics processing units (GPGPU) is a promising way to speed up computationally intensive network functions, such as performing traffic classification based on machine learning at line speed. Recent studies have focused on integrated graphics units and various performance optimizations to address bottlenecks such as latency. However, these approaches tend to produce architecture-specific binaries and lack the orchestration of functions. A complementary effort would be a GPGPU architecture based on standard and open components, which allows the creation of interoperable and orchestrable network functions. This study describes and evaluates such an open architecture based on the cross-platform Vulkan API, in which we execute hand-written SPIR-V code as a network function. We also demonstrate a multi-node orchestration approach for our proposed architecture using Kubernetes. We validate our architecture by executing SPIR-V code performing traffic classification with random forest inference. We test this application both on discrete and integrated graphics cards and on x86 and ARM. We find that in all cases the GPUs are faster than the Cython code.
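The citation statement above proposes replacing the sequential while loop of tree traversal with a path-based representation. The following is a minimal sketch of that general idea, not the encoding used in the cited paper or in the citing authors' code. The toy tree, the node layout, and all function names are hypothetical. Each leaf gets one row listing the branch decisions on its root-to-leaf path; classification then evaluates every node test independently and lets each row check its own path, so neither step depends on the result of a previous node visit.

```python
# Hypothetical toy tree; the layout and names are illustrative assumptions.
# Internal node: (feature_index, threshold, left_child, right_child)
# Leaf node:     ("leaf", class_label)
TREE = {
    0: (0, 5.0, 1, 2),
    1: (1, 2.0, 3, 4),
    2: ("leaf", "C"),
    3: ("leaf", "A"),
    4: ("leaf", "B"),
}


def classify_sequential(tree, sample, root=0):
    """Baseline: pointer-chasing traversal with a data-dependent while loop."""
    node = tree[root]
    while node[0] != "leaf":
        feature, threshold, left, right = node
        node = tree[left if sample[feature] <= threshold else right]
    return node[1]


def build_path_matrix(tree, root=0):
    """One-time preprocessing: one row per leaf, holding the class label and
    the (node_id, must_go_left) decisions along the root-to-leaf path."""
    rows = []
    stack = [(root, [])]
    while stack:
        node_id, path = stack.pop()
        node = tree[node_id]
        if node[0] == "leaf":
            rows.append((node[1], path))
        else:
            _, _, left, right = node
            stack.append((left, path + [(node_id, True)]))
            stack.append((right, path + [(node_id, False)]))
    return rows


def classify_with_paths(tree, rows, sample):
    """Path-based classification in two independent steps."""
    # Step 1: evaluate every internal node test; no test depends on another.
    outcomes = {
        node_id: sample[node[0]] <= node[1]
        for node_id, node in tree.items()
        if node[0] != "leaf"
    }
    # Step 2: every row checks its own path; exactly one row matches.
    matches = [
        label
        for label, path in rows
        if all(outcomes[node_id] == want for node_id, want in path)
    ]
    return matches[0]


if __name__ == "__main__":
    rows = build_path_matrix(TREE)
    for sample in ([3.0, 1.0], [3.0, 4.0], [7.0, 0.0]):
        print(sample, classify_with_paths(TREE, rows, sample))
```

Pure Python only illustrates the dependency structure here. On a GPU, each node test in step 1 and each row check in step 2 could be assigned to its own thread, at the cost of evaluating tests that the sequential traversal would skip.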
