2022
DOI: 10.1109/tpds.2021.3104240
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication

Cited by 20 publications (2 citation statements)
References 31 publications
“…However, typical NN layouts used for benchmarking purposes, such as ResNet152 and AlexNet [39], require a total number of 25 and 62 million trainable parameters, respectively, that can hardly fit as hardware-coded information even into the available number of computational elements supported by current top-class GPU and TPU platforms. This has turned tiled matrix multiplication (TMM) into the mainstream processing paradigm in today’s AI engines [40], [41], where both the input and the weighting values have to be updated at line rate through time division multiplexing (TDM) approaches until all matrix tiles are processed. To this end, the upgrade of neuromorphic photonics into a versatile AI processing platform has to proceed along the paradigm of today’s TPU and GPU computational engines, where a limited amount of hardware resources can execute DNNs with significantly higher dimensions.…”
Section: Introduction (mentioning)
confidence: 99%
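To make the tiled matrix multiplication (TMM) paradigm mentioned in the passage above more concrete, the sketch below multiplies two matrices one tile pair at a time, analogous to reloading a fixed-size compute array with input and weight tiles until all tiles are processed. This is a minimal illustrative sketch, not code from the cited paper: the function name tiled_matmul, the tile size, and the loop order are assumptions chosen for clarity.

```python
import numpy as np

def tiled_matmul(A, B, tile=64):
    """Multiply A (M x K) by B (K x N) one tile at a time.

    Illustrative sketch only: the tile size and loop order are
    assumptions, not the schedule evaluated in the paper.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):            # row tiles of A / C
        for j in range(0, N, tile):        # column tiles of B / C
            for k in range(0, K, tile):    # reduction tiles
                # Each (i, j, k) step stands in for loading one pair of
                # input/weight tiles onto a fixed-size compute array and
                # accumulating the partial product into the output tile.
                C[i:i+tile, j:j+tile] += A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
    return C

# Quick check against the reference product.
A = np.random.rand(130, 97)
B = np.random.rand(97, 150)
assert np.allclose(tiled_matmul(A, B, tile=32), A @ B)
```

Because only one pair of tiles is resident at a time, a hardware array far smaller than the full weight matrix can still compute the complete product, which is the property the citing authors rely on when arguing that models with tens of millions of parameters cannot be hardware-coded in full.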
“…Nevertheless, the nature of such a thorough investigation does not align directly with the objectives of the current study. Cutting edge research currently attempts to optimise the architecture and performance of Machine Learning algorithms by exploring solutions ranging from the use of genetic algorithms [11] and dispersing the computational and data processing load to peripheral computers (the Edge) [12], to developing state of the art spatial accelerators to expedite big data processing [13]. Despite the utmost significance of these advancements, Appl.…”
Section: Introduction, 1. Purpose and Innovation of This Work (mentioning)
confidence: 99%