Design space exploration of a configurable, heterogeneous system for a given application with required throughput searches for a combination of cores or softcores with different architectures that can be accommodated within the given ASIC or FPGA area and that achieves the required throughput and optimizes power consumption. For a soft real-time streaming application, modeled as a task graph with internally parallelizable streaming tasks, this requires assigning a core type and quantity and DVFS frequency level to each task, which implies task runtime and energy consumption, and mapping and scheduling the tasks, such that the throughput requirement is met. We tightly integrate such static scheduling for stream processing applications with design space exploration of the best heterogeneous core combination, and solve the resulting combined optimization problem by an integer linear program (ILP). We evaluate our solution for different numbers of core types on synthetic and application-based task graphs, and demonstrate improvements of up to 34.8% for ARM big and LITTLE cores, and 70.5% for 3 different core types.