In this work, we propose a novel embedded dual-execution-mode 32-bit processor architecture (QSP32) that supports both queue and stack programming models. The QSP32 core is based on a high-performance produced-order parallel queue architecture and targets applications constrained in area, memory, and power. The design focuses on executing queue programs while also supporting stack programs without a considerable increase in hardware over the base queue architecture. A prototype of the processor is produced by synthesizing the high-level model for a target FPGA device. We present the architecture description and the design results in detail. The design and evaluation results show that the QSP32 core efficiently executes both queue-based and stack-based programs and achieves an average clock speed of about 65 MHz. In addition, compared to the base single-mode architecture (PQP), the QSP32 core requires only about 2.41% additional hardware. Moreover, the prototype fits on a single FPGA device, which eliminates the need for multi-chip partitioning and the loss of resource efficiency that such partitioning causes.
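To make the two programming models concrete, the following is a minimal sketch of how the same expression, (a + b) * (c - d), might be evaluated under a queue model and under a stack model. It is a toy interpreter written for illustration only: the mnemonics (`ld`, `add`, `sub`, `mul`), the tuple encoding, and the function names are assumptions, and the sketch does not model the QSP32 instruction set, its pipeline, or its parallel issue.

```python
from collections import deque
import operator

# Hypothetical mnemonics for illustration; not the QSP32 instruction set.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run_queue(program):
    """Queue model: operands leave at the head, results enter at the tail."""
    q = deque()
    for op, *arg in program:
        if op == "ld":
            q.append(arg[0])            # enqueue a literal at the tail
        else:
            lhs = q.popleft()           # first operand from the head
            rhs = q.popleft()           # second operand from the head
            q.append(OPS[op](lhs, rhs)) # result goes to the tail
    return q.popleft()


def run_stack(program):
    """Stack model: operands and results share the top of the stack."""
    s = []
    for op, *arg in program:
        if op == "ld":
            s.append(arg[0])            # push a literal
        else:
            rhs = s.pop()               # second operand is on top
            lhs = s.pop()
            s.append(OPS[op](lhs, rhs)) # push the result
    return s.pop()


# (a + b) * (c - d) with a=5, b=3, c=9, d=2.
# Queue order: all leaves first, then operations level by level toward the root.
QUEUE_PROGRAM = [("ld", 5), ("ld", 3), ("ld", 9), ("ld", 2),
                 ("add",), ("sub",), ("mul",)]
# Stack order: the usual postfix traversal of the same expression.
STACK_PROGRAM = [("ld", 5), ("ld", 3), ("add",),
                 ("ld", 9), ("ld", 2), ("sub",), ("mul",)]

if __name__ == "__main__":
    print(run_queue(QUEUE_PROGRAM), run_stack(STACK_PROGRAM))
```

In the queue ordering, `add` and `sub` read disjoint operands and do not depend on each other, which hints at why a queue model can expose parallelism; in the stack ordering every instruction contends for the top of the stack.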