Embedded systems are widely used in many sophisticated applications. To speed the time-to-market cycle, the hardware and software co-design has become one of the main methodologies in modern embedded systems. The most important challenge in the embedded system design is partitioning; i.e. deciding which modules of the system should be implemented in hardware and which ones in software. Finding an optimal partition is hard because of the large number and different characteristics of the modules that have to be considered.In this article, we develop a new high-level hardware/software partitioning methodology. Two novel features characterize this methodology. Firstly, the Particle Swarm Optimization (PSO) technique is introduced to the Hardware/Software partitioning field. Secondly, the hardware is modeled using two extreme implementations that bound different hardware scheduling alternatives. Our methodology further partitions the design into hardware and software modules at the early Control-Data Flow Graph (CDFG) level of the design; thanks to improved modeling techniques using intermediate-granularity functional modules. A new restarting technique is applied to PSO to avoid quick convergence. This technique is called Re-Excited PSO. Our numerical results prove the usefulness of the proposed technique.The target technology is Field Programmable Gate Arrays (FPGAs). We developed FPGA-based estimation techniques to evaluate the costs of implementing the design components. These costs are the area, delay, latency, and power consumption for both the hardware and software implementations. Hardware/software communication is also taken into consideration. This research is an extended version of IESS 2007 conference paper [5]. M.B. Abdelhalim ( ) College 20 M.B. Abdelhalim et al.The aforementioned methodology is embodied in an integrated CAD tool for hardware/software co-design. This tool accepts behavioral, un-timed, algorithmic-level, VHDL, design representation, and outputs a valid hardware/software partition and schedule for the design subject to a set of area/power/delay constraints. This tool is code named CUPSHOP for (Cairo University PSo-based Hardware/sOftware Partitioning tool). Finally, a JPEGencoder case study is used to validate and contrast our partitioning methodology against the prior-art methodologies.