This work presents an extensive case study on the model-based design of a commercial-grade stream processing middleware (IBM's InfoSphere Streams), its runtime, and its language (SPL) compiler. The model-based underpinnings are pervasive throughout the whole environment, from describing inter-process communication interfaces and objects to the design of the extensibility mechanism in the runtime and language. In addition to many software engineering advantages, such as consistent, uniform, and self-documented integration among the different parts of the system, we show intrinsic performance benefits to the platform derived from this design approach. First, we demonstrate how an incremental compilation strategy employed by the SPL compiler, rooted in the model description of the application that the compiler extracts as part of the application building process, leads to better compile-time performance. Second, we discuss how the model-based code generation strategy employed by the SPL compiler also leads to increased runtime performance by specializing the generated code to particular characteristics of the runtime environment. Finally, we show how the extensibility strategy used in the SPL language leads to automatic syntactic and semantic checks at compile time, while enabling behavioral reasoning and specific optimizations at runtime.

1364 B. GEDIK AND H. ANDRADE

providing additional capabilities to address new requirements imposed by customers and evolving business demands. The model-based framework forms the basis of the extensibility provided by our language and runtime system. Their research prototype roots and evolution into a commercial-grade software platform are briefly discussed in Section 2. In Section 3, we look in detail at how we used these ideas throughout the entire system and at the specific benefits we derived from relying on this approach.
For example, a model description can be used to communicate information between the different parts of a system that must interact with one another. Furthermore, we use the notion of a model to capture and expose as much information as is needed for detailed reasoning about a program to be carried out at compile time, as described in Section 4. As will be shown, this approach minimizes the runtime overheads typical of stream computing platforms that rely on interpreting queries, as is the case for many SQL query engines. This leads us to the runtime environment and its model-driven backbone, which we discuss in Section 5.

Specifically, this paper presents a model-driven framework we used to develop a new stream processing language (SPL) as well as the distributed runtime (InfoSphere Streams) supporting this language, starting from their respective research counterparts, the SPADE language and the System S middleware.

Approaching a design task as a model-driven process, whereby a common integration framework is made explicit, is a methodology that has been used in disparate areas such as manufacturing, embedded systems, communication systems, as ...