With dozens to hundreds of processing cores deployed in next-generation packet processors, regular topologies such as the mesh are widely adopted in Network-on-Chip design to provide scalable interconnection among cores. Although such packet processors are rich in raw processing power, the utilization of hardware resources plays a critical role in overall system performance. In this paper, we focus on processing-task mapping and on-chip packet routing, which are the key issues for data-path performance on next-generation packet processors. We present a genetic algorithm to explore the assignment of tasks to cores, and we exploit the on-chip interconnect by splitting the traffic between cores across multiple paths. The share of the split-flow traffic assigned to each routing path is determined by linear programming. Our experimental results on a prototype packet processor architecture show that the proposed algorithm is efficient and scalable.

Index Terms: network processor, runtime management, task allocation.
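
To make the task-mapping idea concrete, the following is a minimal sketch of an evolutionary search that places tasks on a mesh so that heavily communicating tasks end up close together. It is an illustration under stated assumptions, not the algorithm evaluated in this paper: the cost function (traffic-weighted hop count), the mutation-only search without crossover, and all names (`hop_count`, `mapping_cost`, `evolve_mapping`, the `traffic` dictionary) are hypothetical. The linear-programming step that splits flows across paths is not covered here.

```python
import random

def hop_count(a, b, width):
    """Manhattan distance between cores a and b on a mesh with `width` columns."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)

def mapping_cost(mapping, traffic, width):
    """Traffic-weighted hop count (assumed cost); mapping[t] is the core hosting task t."""
    return sum(rate * hop_count(mapping[s], mapping[d], width)
               for (s, d), rate in traffic.items())

def evolve_mapping(traffic, n_tasks, width, height,
                   generations=200, pop_size=30, seed=0):
    """Mutation-only evolutionary search over task-to-core mappings."""
    rng = random.Random(seed)
    cores = list(range(width * height))
    # Each individual assigns every task a distinct core.
    population = [rng.sample(cores, n_tasks) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: mapping_cost(m, traffic, width))
        survivors = population[:pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            child = list(rng.choice(survivors))
            free = [c for c in cores if c not in child]
            i = rng.randrange(n_tasks)
            if free and rng.random() < 0.5:
                child[i] = rng.choice(free)      # move a task to an idle core
            else:
                j = rng.randrange(n_tasks)
                child[i], child[j] = child[j], child[i]  # swap two tasks
            children.append(child)
        population = survivors + children
    return min(population, key=lambda m: mapping_cost(m, traffic, width))

# Hypothetical example: a 4-stage pipeline on a 2x2 mesh.
traffic = {(0, 1): 10, (1, 2): 5, (2, 3): 1}
best = evolve_mapping(traffic, n_tasks=4, width=2, height=2)
print(best, mapping_cost(best, traffic, 2))
```

In this sketch, `traffic` maps a (source task, destination task) pair to an assumed traffic rate. A fuller treatment would add crossover and would score each candidate mapping by the link loads that result after the multipath flow split, rather than by hop count alone.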