2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2022
DOI: 10.1109/aicas54282.2022.9869996

Scale up your In-Memory Accelerator: Leveraging Wireless-on-Chip Communication for AIMC-based CNN Inference

Abstract: Analog In-Memory Computing (AIMC) is emerging as a disruptive paradigm for heterogeneous computing, potentially delivering orders of magnitude better peak performance and efficiency than traditional digital signal processing architectures on matrix-vector multiplication. However, to sustain this throughput in real-world applications, AIMC tiles must be supplied with data at very high bandwidth and low latency; this puts unprecedented pressure on the on-chip communication infrastructure, which becomes the system's…

Citations: cited by 3 publications (2 citation statements)
References: 17 publications
“…b.) Parallelized convolution [14]: This is a CNN-based inference workload in which the layers of the network and inputs are tiled and deployed on separate cores. This is a pure L2 to L1 (core) and L1 (core) to L2 memory traffic pattern and has no intercore communication.…”
Section: B Synthetic Traffic (mentioning)
confidence: 99%
“…This is a pure L2 to L1 (core) and L1 (core) to L2 memory traffic pattern and has no intercore communication. c.) Pipelined convolution [14]: Depth-first or pipeline dataflow is used in many new DNN platforms to efficiently run CNN-based inference. In this scheme, layers are executed in parallel, in a pipelined way across the different cores to reduce the data traffic to higher memory levels [15].…”
Section: B Synthetic Traffic (mentioning)
confidence: 99%
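
The two traffic patterns described in the citation statements above can be contrasted with a small sketch. This is only an illustration under simplifying assumptions that do not come from the cited papers: one transfer per tile or feature map, exactly one layer per core in the pipelined case, and hypothetical names (`parallel_conv_traffic`, `pipelined_conv_traffic`, `count_l2_transfers`, `core0`, and so on).

```python
def parallel_conv_traffic(num_cores):
    """Parallelized convolution: every core exchanges data only with L2."""
    transfers = []
    for core in range(num_cores):
        transfers.append(("L2", f"core{core}"))  # load the core's tile from L2
        transfers.append((f"core{core}", "L2"))  # write the core's result to L2
    return transfers


def pipelined_conv_traffic(num_cores):
    """Pipelined convolution: assumed one layer per core, data flows core to core."""
    transfers = [("L2", "core0")]  # the first layer reads its input from L2
    for core in range(num_cores - 1):
        # intermediate feature maps go to the next core instead of back to L2
        transfers.append((f"core{core}", f"core{core + 1}"))
    transfers.append((f"core{num_cores - 1}", "L2"))  # the last layer writes to L2
    return transfers


def count_l2_transfers(transfers):
    """Count the transfers that have L2 as source or destination."""
    return sum(1 for src, dst in transfers if "L2" in (src, dst))


if __name__ == "__main__":
    for name, pattern in (("parallel", parallel_conv_traffic(4)),
                          ("pipelined", pipelined_conv_traffic(4))):
        l2 = count_l2_transfers(pattern)
        print(f"{name}: {len(pattern)} transfers, {l2} touching L2, "
              f"{len(pattern) - l2} inter-core")
```

In this toy model the parallelized pattern produces no inter-core transfers, while the pipelined pattern keeps intermediate data between cores and touches L2 only at the two ends of the pipeline, which matches the reduced traffic to higher memory levels that the second statement mentions.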