2007 5th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE 2007)
DOI: 10.1109/memcod.2007.371239
Hardware Acceleration of Matrix Multiplication on a Xilinx FPGA

Cited by 35 publications (15 citation statements)
References 2 publications
“…Then each kernel is expanded consecutively, forming the filter matrix F_m. As shown in the figure above, many pixels that are included in overlapping kernels will be duplicated in the matrix I_m, which seems inefficient, though large matrix-to-matrix multiplications are highly optimizable and parallelizable [6]. This linear algebra computation in Caffe is done using the GEMM function, which is highly tuned in BLAS libraries.…”
Section: A. Caffe's Convolution With GEMM
confidence: 99%
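The excerpt above describes lowering convolution to a single GEMM by unrolling overlapping image patches into rows of a matrix (the duplication it calls inefficient). The following is a minimal NumPy sketch of that idea; the names `im2col` and `conv2d_gemm` are illustrative, not Caffe's actual API.

```python
import numpy as np

def im2col(image, k):
    """Unroll every k x k patch of a 2-D image into one row of a matrix.
    Overlapping patches duplicate pixels, exactly as the quote notes."""
    h, w = image.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((out_h * out_w, k * k))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = image[i:i + k, j:j + k].ravel()
    return cols

def conv2d_gemm(image, kernel):
    """Convolution (cross-correlation) computed as one matrix product."""
    k = kernel.shape[0]
    cols = im2col(image, k)          # (out_h*out_w, k*k), pixels duplicated
    out = cols @ kernel.ravel()      # the single GEMM/GEMV call
    out_h = image.shape[0] - k + 1
    return out.reshape(out_h, -1)
```

Despite the duplicated pixels, the dense product maps onto highly tuned BLAS routines, which is the trade-off the quoted passage makes.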
“…The dense matrix multiplication design proposed by Kumar et al., which is based on the rank-one update algorithm, heavily inspired the design of the solver engine [15]. The largest fraction of work has focused specifically on matrix multiplication on FPGAs [16,17,18,19,20,21]. Other works have also considered matrix-vector multiplication and dot products [22] on FPGAs, or matrix multiplication on GPUs [23].…”
Section: Accelerators
confidence: 99%
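The rank-one update algorithm mentioned above forms the product as a sum of outer products, one per shared dimension, which maps naturally to streaming hardware. A minimal NumPy sketch of that formulation (illustrative only, not the cited design):

```python
import numpy as np

def matmul_rank1(A, B):
    """Compute C = A @ B as a sum of rank-one updates.
    Each step adds the outer product of one column of A
    with the matching row of B: C += A[:, k] (x) B[k, :]."""
    m, n = A.shape[0], B.shape[1]
    C = np.zeros((m, n))
    for k in range(A.shape[1]):
        C += np.outer(A[:, k], B[k, :])  # one rank-one update per step
    return C
```

Each update touches every output element once, so an FPGA implementation can stream one column/row pair per cycle while accumulating the full result in place.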
“…Matrix multiplication is a widely used operation in linear algebra, and therefore many interesting reconfigurable designs have been proposed, each satisfying specific requirements: [2] and [3] try to reduce the silicon complexity of the reconfigurable device, while [4] focuses on 64-bit floating-point elements, [5] only on square matrices with double-precision floating-point elements, [6] on multiplying small matrices, and [7] on improving the silicon cost and energy efficiency of the design. The only designs which are optimized for fixed-point matrix multiplication are: (a) the one by Dave et al.
Section: Comparison With Existing Reconfigurable Systems
confidence: 99%
“…[5], which was proposed at the 2007 MEMOCODE hardware/software co-design contest, and (b) the subsystem implemented by Zicari et al. [10]. Both designs support only square matrices, and they split the processing tasks between the reconfigurable hardware resources and the built-in PowerPC of a Xilinx Virtex-II Pro device.…”
Section: Comparison With Existing Reconfigurable Systems
confidence: 99%
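The fixed-point optimization discussed above amounts to quantizing operands to scaled integers so the multiply-accumulates run on integer hardware. A minimal sketch of that arithmetic; the Q-format width (`frac_bits`) and rounding scheme here are assumptions for illustration, not taken from any of the cited designs:

```python
import numpy as np

def fixed_point_matmul(A, B, frac_bits=8):
    """Matrix multiply on values quantized to signed fixed point.
    Operands are scaled by 2**frac_bits and rounded to integers,
    multiplied exactly in integer arithmetic (as FPGA MAC units do),
    then rescaled back to real values."""
    scale = 1 << frac_bits
    Aq = np.round(A * scale).astype(np.int64)  # quantize operands
    Bq = np.round(B * scale).astype(np.int64)
    Cq = Aq @ Bq                               # exact integer MACs
    return Cq / (scale * scale)                # undo the double scaling
```

The result differs from the floating-point product only by the quantization error of the operands, which is the accuracy/area trade-off fixed-point designs exploit.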