Network processor systems combine the performance of ASICs with the programmability of general-purpose processors. One of the main challenges in designing these systems is the memory subsystem used for forwarding and queueing packets. In this work, we study the queueing behavior and packet delays in a network processor system that operates as a router. We introduce a system model and a simulation tool based on that model. With this tool, we model both best-effort and DiffServ IPv4 forwarding and test them with real-world and synthetically generated packet traces. We use the results on queueing behavior to dimension the various queues; these results can also serve as guidelines for designing memory subsystems and queueing disciplines. In particular, we propose a system with small queue sizes. The results on packet delays show that our DiffServ setup provides good service differentiation between best-effort and priority packets. Finally, the study reveals that the choice of traces has a large impact on the results when evaluating router and switch architectures.