Emerging embedded systems require heterogeneous multiprocessor SoC architectures that can satisfy both high-performance and programmability. However, as the complexity of embedded systems increases, software programming on an increasing number of multiprocessors faces several critical problems, such as multithreaded code generation, heterogeneous architecture adaptation, short design time, and low cost implementation. In this paper, we present a software code generation flow based on Simulink to address these problems. We propose a functional modeling style to capture data-intensive and controldependent target applications, and a system architecture modeling style to seamlessly transform the functional model into the target architecture. Both models are described using Simulink. From a system architecture Simulink model, a code generator produces a multithreaded code, inserting thread and communication primitives to abstract the heterogeneity of the target architecture. In addition, the multithread code generator called LESCEA applies the extensions of dataflow based memory optimization techniques, considering both data and control dependency. Experimental results on a Motion-JPEG decoder and an H.264 decoder show that the proposed multithread code generator enables easy software programming on different multiprocessor architectures with substantially reduced data memory size (up to 68.0%) and code memory size (up to 15.9%).