Current research on interfacing clusters within Hierarchical Networks-on-Chip (HNoC), as well as on interfacing NoC-based systems, adopts a centralized approach. In this approach, a specific Processing Element (PE) acts as a gateway between the interfacing peripherals and the rest of the NoC elements. This paper evaluates this approach and shows that it is not optimal for handling inter-NoC communication. Routing inter-NoC traffic through a system to its gateway PE degrades network performance. Results show that both the throughput and the latency of the centralized approach degrade as the inter-NoC traffic bandwidth increases. To alleviate this, we propose a novel distributed approach that separates inter-NoC traffic from intra-NoC traffic. Our approach relies on distributed buffers to allow PEs to communicate efficiently with the interfacing peripheral. We evaluate our approach against other interfacing approaches using synthetic traffic as well as real benchmark applications. Our evaluation covers the performance of the whole system as well as that of its inter- and intra-NoC parts. Results show that the proposed approach outperforms previous interfacing approaches in terms of throughput and latency. The proposed approach significantly enhances inter-NoC performance without any deterioration in intra-NoC performance. For inter-NoC traffic, we achieve a throughput close to the maximum attainable one. Other approaches show major performance degradation, dropping to as low as 10% of this maximum attainable throughput.

INDEX TERMS Hierarchical Networks-on-Chip (HNoC), Inter-NoC communication, Intra-NoC communication, NoC benchmarks, NoC Ethernet, NoC high-speed interfacing, NoC Time Division Multiple Access (TDMA), NoC traffic.