Proceedings of the 21st Annual Symposium on Integrated Circuits and System Design 2008
DOI: 10.1145/1404371.1404392

Implementation of a double-precision multiplier accumulator with exception treatment to a dense matrix multiplier module in FPGA

Abstract: Recently, manufacturers of supercomputers have used FPGAs to accelerate scientific applications [16] [17]. Traditionally, FPGAs were used only in non-scientific applications. The main reasons are the complexity of floating-point computation, the insufficient number of FPGA logic cells for implementing scientific cores, and the complexity of the cores, which prevents them from operating at high frequencies. Nowadays, the increased availability of specialized blocks for complex operations, such as sum and multipli…

Cited by 3 publications (5 citation statements)
References 6 publications

“…The MAC has a pipeline with 33 stages. Since the data reuse strategy proposed in [4] substantially reduced the data access bottleneck, the number of MACs that can be instantiated in the FPGA is limited by the number of DSP blocks in the FPGA [5].…”
Section: B. The Architecture
Citation type: mentioning
confidence: 99%
“…Hence, the order of the matrices operated on has to meet the condition […] The second consideration is the compromise between the number of MACs (which is limited by the number of DSP blocks in the FPGA) [17] and the memory bandwidth available in the architecture, given by…”
Section: Data Reuse Exploitation Strategy
Citation type: mentioning
confidence: 99%
“…• the processing block is similar to the one presented in Figure 4 and uses double-precision floating-point multiplier-accumulators (MACs), according to the IEEE-754 standard [17]. Figure 6 shows a block diagram of the developed architecture.…”
Section: Case Study - Processing Architecture
Citation type: mentioning
confidence: 99%
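
The multiply-accumulate operation described in the statement above can be illustrated in software. The following is a minimal sketch only, not the paper's hardware design: it assumes Python floats are IEEE-754 double precision (true on common platforms), and the names `classify`, `Mac`, and `matmul`, as well as the exception labels, are hypothetical choices made for this illustration.

```python
import math


def classify(result, *operands):
    """Return a hypothetical exception label for one operation, or None."""
    if math.isnan(result):
        # NaN produced from non-NaN operands indicates an invalid
        # operation (for example inf * 0 or inf - inf).
        if any(math.isnan(x) for x in operands):
            return "nan_propagated"
        return "invalid"
    if math.isinf(result) and all(math.isfinite(x) for x in operands):
        # Finite operands yielding an infinite result indicates overflow.
        return "overflow"
    return None


class Mac:
    """Software model of one multiplier-accumulator: acc <- acc + a * b."""

    def __init__(self):
        self.acc = 0.0
        self.flags = set()

    def step(self, a, b):
        product = a * b
        total = self.acc + product
        # Record exceptions raised by the multiply and by the add.
        for label in (classify(product, a, b),
                      classify(total, self.acc, product)):
            if label is not None:
                self.flags.add(label)
        self.acc = total
        return total


def matmul(a, b):
    """Dense matrix product using one Mac per output element."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    flags = set()
    for i in range(rows):
        for j in range(cols):
            mac = Mac()
            for k in range(inner):
                mac.step(a[i][k], b[k][j])
            c[i][j] = mac.acc
            flags |= mac.flags
    return c, flags
```

Calling `matmul([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])` returns the product together with an empty flag set, while operands that overflow the double-precision range add an `"overflow"` label. A hardware MAC would be pipelined and would evaluate many output elements in parallel; this sequential model only shows the arithmetic and the exception checks.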
“…• The processing block is similar to the one presented in Figure 5 and uses double-precision floating-point multiplier-accumulators (MACs), according to the IEEE-754 standard [18].…”
Section: Case Study - Processing Architecture
Citation type: unclassified
“…is the number of words that can be stored in the BRAMs; the second consideration is the compromise that must exist between the number of MACs (which is limited by the number of DSP blocks) [18] and the bandwidth available in the architecture […] where bw is the memory bandwidth in bits per second, k is the number of MACs, f is the operating frequency of the FPGA, and N_DSP_MAC is the maximum number of MACs that can be instantiated in the FPGA using the available DSPs.…”
Citation type: unclassified