Abstract: In recent years, growth in both the number of cores and core frequency across successive processor generations has proportionally increased bandwidth demands in both CPU and GPU systems. Recent heterogeneous mobile embedded systems with hard or firm real-time requirements share a single address space between CPU and GPU to reduce the communication latency between their memories, but this sharing adds significant contention. In addition, when heterogeneous cores are simultaneously present in a single system, memory parallelism is significantly restricted by the small number of memory controllers (MCs). As a strategy to relieve this memory pressure, this paper evaluates the impact of scaling the number of MCs up to 4-8 units, a range limited by motherboard size in embedded designs. Our findings show that performance improves by a factor of 4x when only CPU cores are employed, 4.6x when only GPU cores are employed, and 2x when CPU and GPU cores are employed simultaneously.