Scalability and performance in multicore processors for embedded and real-time systems usually don't go well each with the other. Networks on Chip (NoCs) provide scalable execution platforms suitable for such kind of embedded systems. This article presents a NoC-based Heterogeneous Multi-Processor system, called NoC-HMP, which is a scalable platform for embedded systems developed in the GALS language SystemJ. NoC-HMP uses a time-predictable TDMA-MIN NoC to guarantee latencies and communication time between the two types of time-predictable cores and can be customized for a specific performance goal through the execution strategy and scheduling of SystemJ program deployed across multiple cores. Examples of different execution strategies are introduced, explored and analyzed via measurements. The number of used cores can be minimized to achieve the target performance of the application. TDMA-MIN allows easy extensions of NoC-HMP with other cores or IP blocks. Experiments show a significant improvement of performance over a single core system and demonstrate how the addition of cores affects the performance of the designed system.