The Branch and Bound algorithm (B&B) is a well known method for solving optimally Combinatorial Optimization Problems (COPs). This method is based on intelligent enumeration of all feasible solutions which reduce considerably the search space. Nevertheless, it remains inefficient when using the sequential approach to deal with large problem instances due to its huge resolutions time. However, the execution time can be reduced considerably by using parallel computing architectures. With the huge evolution of the multi-cores CPUs and GPUs, It is quite hard to design schemes that efficiently exploit the different hardware architectures simultaneously. As a result, most of the existing works focus on exploiting one hardware architecture at a time. In this paper, we propose five parallel approaches to accelerate the B&B execution time using Multi and Many-core systems at the same time. Our goal is to solve optimally the Blocking Job Shop Scheduling problem (BJSS) which is one of the hardest scheduling problem. The first proposed approach is a multi-search parallelization based on Master/Worker paradigm, exploiting the Multi-Core CPU-processors. The second and the third approaches represent a GPU based parallelization schemes having different level of parallelism and GPU occupancy. The forth and fifth approaches represent a hybridization between the Muli-core approach and the GPU based parallelization approaches. The goal of this hybridization is to benefit from the power of both the CPU-cores and the GPU at the same time. This hybridization is based on concurrent kernels execution provided by Nvidia Multi process Service (MPS) which allows multiple host processes (Master and workers) to use at the same time the GPU to launch their kernels in order to accelerate the bounding of one or several nodes at a time. Experiments using the well known Taillard instances confirm the efficiency of our proposals and show a relative speedup of 160x as compared to an optimized sequential B&B algorithm.