This paper proposes an improved multiplier architecture that uses dual mode logic (DML) and targets single-instruction-multiple-data (SIMD)-like systems. The design introduces improvements at both the architecture and logic gate levels and capitalizes on their synergistic combination. At the architecture level, the multiplier adapts to different computations according to the level of input data parallelism. The main novelty is the joint incorporation of three different acceleration or bypass mechanisms. The configurable multiplier offers three variable-precision configurations: one 32x32-bit multiplier, two 16x16-bit multipliers, or four 8x8-bit multipliers. This bypassing architecture seamlessly integrates DML, which supports two modes of operation: a high-performance dynamic mode and a low-energy static mode, with smooth switching between them. By selecting the DML mode according to the multiplier's bit-width, the design improves the utilization of active computational blocks, overall performance, and energy efficiency. In the dynamic mode, the DML implementation achieves an average performance improvement of 15% for the 32-bit, 8% for the 16-bit, and 7% for the 8-bit multipliers compared to the CMOS implementation. In the static mode, the DML implementation demonstrates an average energy reduction of 28%. In the combined mode, where the 32-bit multiplier operates in dynamic mode for acceleration and the 8-bit multiplier operates in static mode for energy savings, the DML implementation exhibits an average overall performance gain of 15% and up to 18% lower energy consumption. The nontrivial semi-automated flow used to implement the proposed architecture is also presented.
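
The three precision configurations named above can be illustrated with a minimal behavioral sketch. It models only the functional view (one 32x32-bit, two 16x16-bit, or four 8x8-bit products computed from the same pair of 32-bit operand words) and says nothing about the bypass mechanisms or the DML circuit-level modes. The function name `simd_multiply`, the unsigned operand interpretation, and the least-significant-lane-first output order are assumptions made for illustration; they are not taken from the proposed design.

```python
from typing import List


def simd_multiply(a: int, b: int, lane_bits: int) -> List[int]:
    """Hypothetical behavioral model of a variable-precision SIMD multiplier.

    Splits two unsigned 32-bit operand words into lanes of `lane_bits`
    bits and multiplies them lane by lane. Products are returned with the
    least-significant lane first (an assumed ordering).
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16, or 32")
    if not (0 <= a < 2**32 and 0 <= b < 2**32):
        raise ValueError("operands must be unsigned 32-bit values")

    mask = (1 << lane_bits) - 1
    products = []
    for lane in range(32 // lane_bits):
        shift = lane * lane_bits
        # Extract the matching lane from each operand and multiply.
        products.append(((a >> shift) & mask) * ((b >> shift) & mask))
    return products
```

For example, `simd_multiply(0x00030002, 0x00050004, 16)` treats each word as two 16-bit lanes and yields two independent products, while the same call with `lane_bits=32` yields a single full-width product.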