W e study the coarse-grained parallelization of an efficient bundle-based costdecomposition algorithm for the solution of multicommodity min-cost flow (MMCF) problems. We show that a code exploiting only the natural parallelism inherent in the costdecomposition approach, i.e., solving the min-cost flow subproblems in parallel, obtains satisfactory efficiencies even with many processors on large, difficult MMCF problems with many commodities. This is exactly the class of instances where the decomposition approach attains its best results in sequential. The parallel code we developed is highly portable and flexible, and it can be used on different machines. We also show how to exploit a common characteristic of current supercomputer facilities, i.e., the side-to-side availability of massively parallel and vector supercomputers, to implement an asymmetric decomposition algorithm where each architecture is used for the tasks for which it is best suited.