Wire latency across the links of a NoC can limit throughput, especially in deep submicron technologies. Adding stateful pipeline buffers to long links permits a higher clock rate, but wastes resources on links that need only low bandwidth. In asynchronous (clockless) NoCs, link pipelining can be applied selectively, only to those links that benefit from both the increased throughput and the added buffering capacity, which is especially useful in heterogeneous embedded SoCs. We evaluate two strategies for deciding where link pipeline buffers should be placed in the topology. The first compares, for each link, the available bandwidth (derived from physical wirelength) to the throughput required by the source-to-destination paths that traverse it. The second adds buffers to a link until its bandwidth at least matches the throughput of the attached core's network adapter. Both strategies were integrated into our network optimization tool for an application-specific SoC. Simulations were based on the SoC's expected traffic patterns and floorplan-derived wirelengths, and used self-similar traffic generation for more realistic behavior. The results show reduced network latency for large messages and reduced output-buffer delay in the network adapter. Adding pipeline buffers slightly increases power, but by the power*latency product metric our proposal is a complexity-effective improvement. The results indicate that pipelining selected links is more efficient than adding buffers ubiquitously.
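The first placement strategy can be illustrated with a minimal sketch: estimate each link's wirelength-limited bandwidth, sum the throughput of the paths crossing it, and pipeline the links whose demand exceeds their bandwidth. All function names, the bandwidth model, and the numbers below are hypothetical illustrations, not the paper's actual tool or models.

```python
# Hypothetical sketch of the wirelength-vs-demand strategy.
# link_bandwidth uses a toy model: longer wires -> longer cycle
# time -> lower achievable bandwidth on an unpipelined link.

def link_bandwidth(wirelength_mm, base_bw=1000.0, delay_per_mm=0.25):
    # Toy model: bandwidth degrades with wire length (illustrative only).
    return base_bw / (1.0 + delay_per_mm * wirelength_mm)

def links_to_pipeline(wirelengths, paths):
    """wirelengths: {link: length in mm}; paths: list of
    (route, throughput), where route is a list of link names.
    Returns the links whose aggregate path demand exceeds their
    wirelength-limited bandwidth, i.e. candidates for pipelining."""
    # Accumulate, per link, the throughput of every path crossing it.
    demand = {link: 0.0 for link in wirelengths}
    for route, throughput in paths:
        for link in route:
            demand[link] += throughput
    # Keep links where demand exceeds the available bandwidth.
    return sorted(link for link, length in wirelengths.items()
                  if demand[link] > link_bandwidth(length))

# Example: the long link B-C carries two overlapping paths.
wires = {"A-B": 1.0, "B-C": 6.0, "C-D": 2.0}
traffic = [(["A-B", "B-C"], 500.0), (["B-C", "C-D"], 300.0)]
print(links_to_pipeline(wires, traffic))  # -> ['B-C']
```

Only the long, heavily shared link B-C is selected here: its toy bandwidth (1000 / 2.5 = 400) falls below its combined demand (800), while the shorter links keep headroom, mirroring the selective rather than ubiquitous buffer placement the abstract argues for.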