Abstract

Systems biology applies concepts from engineering in order to understand biological networks. If such an understanding were complete, biologists would be able to design ad hoc biochemical components tailored for different purposes, which is the goal of synthetic biology. Needless to say, we are far from creating biological subsystems as intricate and precise as those found in nature, but mathematical models and high-throughput techniques have brought us a long way in this direction. One difficulty that still needs to be overcome is finding the right values for model parameters and dealing with their uncertainty, which is proving to be an extremely difficult task. In this work, we take advantage of ensemble modeling techniques, where a large number of models with different parameter values are formulated and then tested according to some performance criteria. By finding features shared by successful models, the role of different components and the synergies between them can be better understood. We address some of the difficulties often faced by ensemble modeling approaches, such as the need to sample a space whose size grows exponentially with the number of parameters and the need to establish useful selection criteria. We show methods that condense the predictions from many models into a set of understandable “design principles” that can guide the improvement or construction of a biochemical network. Our proposed framework formulates models within standard formalisms in order to integrate information from different sources and minimize the dimension of the parameter space. Additionally, the mathematical properties of the formalism enable a partition of the parameter space into independent subspaces. Each of these subspaces can be paired with a set of criteria that depend exclusively on it, thus allowing separate sampling and screening in spaces of lower dimension.
By applying the tests in a strict order, with computationally cheaper tests applied first to each subspace and computationally expensive tests applied only to the models that remain, the use of resources is optimized and a larger number of models can be examined. This can be compared to a complex database query, where the order of the requests can make a huge difference in processing time. The method is illustrated by analyzing a classical model of a metabolic pathway with end-product inhibition. Even for such a simple model, the method provides novel insight.

Author summary

A method is presented for the discovery of design principles, understood as recurrent solutions to evolutionary problems, in biochemical networks. The method takes advantage of ensemble modeling techniques, where a large number of models with different parameter values are formulated and then tested according to some performance criteria. By finding features shared by successful models, a set of simple rules can be identified that enables us to formulate new models that are known a priori to perform well. By formulating the models within the framework of Biochemical Systems Theory (BST), we manage to overcome some of the obstacles often faced by ensemble modeling. Further analysis of the selected models with standard machine learning techniques enables the formulation of simple rules (design principles) for building well-performing networks. We illustrate the method with a well-known case study: the unbranched pathway with end-product inhibition. The method identifies the known features of this well-studied pathway while providing additional guidelines on how the pathway kinetics can be tuned to achieve a desired functionality (e.g., demand versus supply control), and it identifies important tradeoffs between performance, robustness and stability.
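
The screening order described in the abstract (cheap tests on each independent subspace first, expensive tests only on the survivors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sampling ranges, the function names (`sample_box`, `screen`, `staged_ensemble`) and the three test criteria are hypothetical placeholders chosen only to show the control flow.

```python
import itertools
import random


def sample_box(n_samples, n_params, rng, low=-2.0, high=2.0):
    """Draw parameter vectors uniformly from a box (ranges are assumed)."""
    return [tuple(rng.uniform(low, high) for _ in range(n_params))
            for _ in range(n_samples)]


def screen(candidates, tests):
    """Keep candidates that pass every test, in the order given.

    all() short-circuits, so a later (costlier) test is never evaluated
    for a candidate that already failed an earlier one.
    """
    return [c for c in candidates if all(test(c) for test in tests)]


def staged_ensemble(subspaces, joint_tests, rng, n_samples=100):
    """Screen each subspace on its own, then test combined survivors.

    subspaces: list of (n_params, cheap_tests) pairs, where each test
    depends only on the parameters of that subspace.
    joint_tests: expensive tests that need a full parameter set.
    """
    survivors = [screen(sample_box(n_samples, n_params, rng), cheap_tests)
                 for n_params, cheap_tests in subspaces]
    return [combo for combo in itertools.product(*survivors)
            if all(test(combo) for test in joint_tests)]


# Hypothetical criteria, for illustration only.
def feedback_is_inhibitory(p):
    return p[0] < 0.0


def orders_are_moderate(p):
    return all(abs(x) <= 1.5 for x in p)


def joint_gain_is_bounded(combo):
    a, b = combo
    return abs(a[0] * b[0]) < 1.0


rng = random.Random(0)  # fixed seed so the run is reproducible
subspaces = [(2, [feedback_is_inhibitory]), (2, [orders_are_moderate])]
models = staged_ensemble(subspaces, [joint_gain_is_bounded], rng)
print(len(models), "models passed every stage")
```

Because each subspace is filtered before the Cartesian product is formed, the expensive joint test runs only on combinations of survivors rather than on every sampled combination, which is the same effect as ordering the clauses of a database query so that the most selective, cheapest filters run first.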