2016
DOI: 10.3390/computers5040020

Array Multipliers for High Throughput in Xilinx FPGAs with 6-Input LUTs

Abstract: Multiplication is the dominant operation for many applications implemented on field-programmable gate arrays (FPGAs). Although most current FPGA families have embedded hard multipliers, soft multipliers using lookup tables (LUTs) in the logic fabric remain important. This paper presents a novel two-operand addition circuit (patent pending) that combines radix-4 partial-product generation with addition and shows how it can be used to implement two's-complement array multipliers. The circuit is specific…

Citations: cited by 43 publications (14 citation statements)
References: 29 publications
“…In this case, the generate and propagate signals are functions of the five variables (w_0, w_1, x_i, a_i, 2a_i), and thus can be implemented with the same number of LUTs as with the signed 2-bit independent weights described before. However, in the case of the multiplication by the unsigned 2-bit weights {0, 1, 2, 3}, the generate and propagate signals are functions of the six variables w_0, w_1, x_i, a_i, 2a_i, 3a_i, since the 3× multiple is also needed, and cannot be implemented with a single LUT. However, using a modified Booth recoding algorithm [38] and the implementation method proposed in [39], it is possible to avoid the 3× multiple, and implement the addition of a variable using a 5-variable function with a single level of LUTs.…”
Section: A Hybrid Core For 8-bit Activations and 8/2-bit Weights - C8:82 (mentioning)
confidence: 99%
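
The statement above says that modified Booth recoding removes the need for the 3× multiple. A minimal sketch of textbook radix-4 modified Booth recoding illustrates why: every recoded digit lies in {-2, -1, 0, 1, 2}, so only the 1× and 2× multiples (and their negations) are ever required. This is a generic software model, not the LUT-level method of [39] or of the cited paper; the function names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(multiplier: int, bits: int) -> list[int]:
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Returns digits d_0, d_1, ... (least significant first), each in
    {-2, -1, 0, 1, 2}, such that multiplier == sum(d_k * 4**k).
    The multiplier must fit in `bits` bits as a two's-complement value.
    """
    if bits % 2:
        bits += 1  # sign-extend to an even width

    def bit(k: int) -> int:
        # Python's >> is arithmetic on negative ints, so this sign-extends.
        return (multiplier >> k) & 1 if k >= 0 else 0

    digits = []
    for j in range(0, bits, 2):
        # Overlapping triplet (b_{j+1}, b_j, b_{j-1}) -> one radix-4 digit.
        digits.append(bit(j - 1) + bit(j) - 2 * bit(j + 1))
    return digits


def booth_multiply(a: int, multiplier: int, bits: int) -> int:
    """Multiply by summing recoded partial products (0, ±a, ±2a only)."""
    digits = booth_radix4_digits(multiplier, bits)
    return sum(d * a * 4**k for k, d in enumerate(digits))
```

For example, `booth_radix4_digits(3, 4)` yields `[-1, 1]`, so 3a is formed as 4a - a rather than from a stored 3a multiple. An unsigned 2-bit weight needs one extra (zero) sign bit here, which is why the width is 4 rather than 2.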
“…In [12], a multiplexer-based 8-bit multiplier is presented with 50 MHz frequency, whereas the proposed architecture achieves 320 MHz frequency for 16-bit multiplication. E. George Walters III presents array multipliers using six-input LUTs and shift register LUTs [13], whereas the research presented in this article presents those using four-input LUTs. The modern FPGAs have built-in multipliers in them but still the configurable multipliers using LUTs play a vital role in many applications due to their flexible size, placement and modification ability [13].…”
Section: Introduction (mentioning)
confidence: 99%
“…E. George Walters III presents array multipliers using six-input LUTs and shift register LUTs [13], whereas the research presented in this article presents those using four-input LUTs. The modern FPGAs have built-in multipliers in them but still the configurable multipliers using LUTs play a vital role in many applications due to their flexible size, placement and modification ability [13]. Many researchers have worked on the design of multipliers earlier, as reported in this section, but they have not explored the option of reusing the same resources using iterative methods.…”
Section: Introduction (mentioning)
confidence: 99%
“…For the i-th column of the adder, x_i and y_i are the bits of X and Y, respectively, c_i is the carry-in bit, c_{i+1} is the carry-out bit and s_i is the sum bit. The prop_i signal must be set to x_i ⊕ y_i and the gen_i signal can be set to either x_i or y_i to add x_i and y_i [14,16]. If x_i and y_i together are a function of five or fewer inputs, then the LUT6 can be configured as two LUT5s, generating either x_i or y_i at O5 and routing it to gen_i, and generating x_i ⊕ y_i at O6 to drive prop_i.…”
Section: Proposed Two-operand Adder (mentioning)
confidence: 99%
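
The propagate/generate scheme in the statement above can be checked with a small bit-level model. The sketch below assumes the usual carry-chain behaviour, where the carry-out is the carry-in when prop_i is 1 and gen_i otherwise, and the sum bit is prop_i ⊕ c_i. It is a software illustration only, not the LUT configuration from the cited paper, and `carry_chain_add` is a hypothetical name.

```python
def carry_chain_add(x: int, y: int, width: int, carry_in: int = 0) -> tuple[int, int]:
    """Add two unsigned `width`-bit values column by column.

    Returns (sum_bits, carry_out). Each column uses
      prop_i = x_i XOR y_i
      gen_i  = x_i  (y_i would work equally well)
    """
    carry = carry_in
    total = 0
    for i in range(width):
        x_i = (x >> i) & 1
        y_i = (y >> i) & 1
        prop = x_i ^ y_i
        # When prop == 0 the two bits are equal, so either one is the carry-out.
        gen = x_i
        s_i = prop ^ carry                 # sum bit for this column
        carry = carry if prop else gen     # carry mux: pass carry-in or take gen
        total |= s_i << i
    return total, carry
```

Choosing gen_i = x_i rather than x_i AND y_i is what lets gen_i be either operand bit: it only matters when prop_i = 0, and in that case x_i = y_i.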
“…This paper describes an approach that uses a novel two-operand addition circuit [14][15][16] that combines generation of a pre-computed partial product with addition of another value, similar to Wirthlin's work but optimized for Xilinx FPGAs with 6-input LUTs. A novel approach is used for the case where the constant is negative.…”
Section: Introduction (mentioning)
confidence: 99%