2022
DOI: 10.3390/electronics11152316
AI-Driven Performance Modeling for AI Inference Workloads

Abstract: Deep Learning (DL) is moving towards deploying workloads not only in cloud datacenters, but also on local devices. Although these are mostly limited to inference tasks, this still widens the range of possible target architectures significantly. Additionally, these new targets usually come with drastically reduced computational performance and memory sizes compared to the traditionally used architectures, and they put the key optimization focus on efficiency, as they often depend on batteries. To help developers q…

Cited by 4 publications (2 citation statements)
References 44 publications
“…To solve the above problems, some researchers came up with a kernel-additive method; they predict each kernel operation, such as convolution, dense, and LSTM, individually and sum up all kernel values to predict the overall performance of the DL model [9,16,19,21,23,25]. Yu et al. [24] used the wave-scaling technique to predict the inference latency of the DL model on GPU, but this technique requires access to a GPU in order to make the prediction.…”
Section: Related Work
confidence: 99%
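The kernel-additive method quoted above can be sketched in a few lines: fit or define a latency predictor per kernel type, then sum the per-kernel estimates over the network. The cost functions, coefficients, and example network below are hypothetical placeholders for illustration, not the predictors used in the cited papers.

```python
# Hypothetical per-kernel latency models (illustrative constants, not
# fitted values). Each returns an estimated latency in seconds.
def conv_latency(cin, cout, k, hw):
    # Toy cost model: latency proportional to MAC count of a k x k conv
    # over an hw x hw feature map.
    return cin * cout * k * k * hw * hw * 1e-9

def dense_latency(n_in, n_out):
    # Toy cost model: latency proportional to the weight-matrix size.
    return n_in * n_out * 1e-9

KERNEL_PREDICTORS = {
    "conv": lambda p: conv_latency(*p),
    "dense": lambda p: dense_latency(*p),
}

def predict_model_latency(kernels):
    """Kernel-additive estimate: sum per-kernel predictions to get the
    end-to-end model latency."""
    return sum(KERNEL_PREDICTORS[name](params) for name, params in kernels)

# Example: a small (hypothetical) conv + dense network.
net = [("conv", (3, 16, 3, 32)), ("dense", (256, 10))]
print(predict_model_latency(net))
```

The additive assumption ignores inter-kernel effects (caching, kernel fusion, scheduling overlap), which is precisely the gap that techniques such as wave scaling try to close.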
“…None of these tools is specifically made for compiling code to microcontroller targets, but many of them support microcontroller chips. Sponner et al. have published a good review and benchmark of these tools targeting embedded platforms in [155]. Of these tools, the TVM project is probably the most interesting.…”
Section: Edge AI Software for Microcontrollers
confidence: 99%