Counter and compressor arrays are frequently employed in VLSI multiplier design to reduce partial products efficiently. In reconfigurable systems, by contrast, fast carry chains boost the performance of carry-propagate adders. Consequently, counter and compressor trees are used less often in reconfigurable systems, since they require more logic element area than a carry-propagate scheme. In this work, carry-propagate multioperand adders are employed in smaller blocks, and their outputs are merged using double carry-save encoding to increase performance in reconfigurable systems. The result is a more compact structure than a fully redundant partial product reduction scheme, with speed comparable to that of a counter-array-based carry-save structure. To show the effectiveness of the implementation, fused multiply-accumulate (MAC) units are designed for various bit-widths. The structure is implemented on Altera Stratix III and Cyclone III FPGAs, and the results show that, with the least pipeline depth, its throughput is better than that of regular carry-propagate and fully redundant carry-save reduction schemes.
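The idea of summing operands with carry-propagate adders inside small blocks, while keeping the inter-block carries in a separate redundant word, can be illustrated with a minimal bit-level sketch in Python. This is only an illustration under stated assumptions, not the encoding or hardware structure used in this work: the function names `blockwise_reduce` and `mac`, the `block` parameter, and the choice of working width are hypothetical, and the exact double carry-save format may differ from the simple sum/carry pair shown here.

```python
def blockwise_reduce(operands, width, block):
    """Reduce non-negative operands to a redundant (sum_word, carry_word) pair.

    Each block-wide slice of the operands is summed with an ordinary
    (carry-propagate) multioperand addition. Carries leaving a block are
    stored in a separate word instead of rippling into the next block.
    """
    # Assumption: few enough operands that one block's carry-out
    # fits inside a single block-wide field of the carry word.
    assert len(operands) <= (1 << block)
    assert all(0 <= x < (1 << width) for x in operands)
    mask = (1 << block) - 1
    sum_word = 0
    carry_word = 0
    for lo in range(0, width, block):
        # Carry-propagate multioperand addition local to this block.
        total = sum((x >> lo) & mask for x in operands)
        sum_word |= (total & mask) << lo
        # Block carry-out is saved at the weight of the next block.
        carry_word |= (total >> block) << (lo + block)
    return sum_word, carry_word


def mac(a, b, acc, n_bits, block):
    """Fused multiply-accumulate a*b + acc using block-wise reduction."""
    # Partial products of an unsigned n_bits x n_bits multiplication.
    partial_products = [a << i for i in range(n_bits) if (b >> i) & 1]
    operands = partial_products + [acc]
    width = 2 * n_bits + block  # assumed headroom for the accumulator
    s, c = blockwise_reduce(operands, width, block)
    # A single final carry-propagate addition resolves the redundant form.
    return s + c
```

For example, `mac(13, 11, 7, n_bits=4, block=4)` returns 150. In this sketch only the additions inside each block and the final `s + c` propagate carries, which loosely mirrors the trade-off described above between a fully carry-propagate scheme and a fully redundant carry-save tree.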