Feedforward neural networks are established as versatile tools for nonlinear black-box modeling, but in many data-mining tasks the choice of relevant inputs and of network complexity remains a major challenge. Statistical tests proposed in the literature for detecting relations between inputs and outputs are largely based on linear-systems theory, and laborious retraining combined with the risk of getting stuck in local minima makes an exhaustive search through all possible network configurations infeasible for all but toy problems. This paper proposes a systematic method for the problem of estimating an output on the basis of a (large) set of potential inputs. Feedforward neural networks of multilayer perceptron type are used in a three-stage approach: First, starting from sufficiently large networks, an efficient pruning method is applied to detect potential model candidates. Next, the best results of the pruning runs are extracted by forming a Pareto frontier under the conflicting objectives of minimizing network complexity and estimation error. The networks on this frontier are considered to contain promising hidden nodes with their specific connections to relevant input variables. These hidden nodes are then optimally combined by mixed-integer linear programming to form a final set of neural network models, from which the user can select a model of suitable complexity. The modeling method is applied to an illustrative test example as well as to a complex modeling problem from the metallurgical industry, i.e., prediction of the silicon content of hot metal produced in a blast furnace. It is demonstrated to find relevant inputs and to yield parsimonious, sparsely connected neural models of the output.
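The Pareto-frontier extraction in the second stage can be illustrated with a short sketch. This is not the paper's implementation; the function name and the representation of each pruned network as a (complexity, error) pair are assumptions made purely for illustration:

```python
def pareto_frontier(candidates):
    """Return the non-dominated candidates in (complexity, error).

    candidates: list of (complexity, error) pairs, e.g. one pair per
    network obtained from the pruning runs.  A candidate is dominated
    if another candidate is at least as good in both objectives and
    strictly better in one.
    """
    frontier = []
    best_error = float("inf")
    # Sort by complexity (ties broken by error) and sweep: a candidate
    # joins the frontier only if it achieves a strictly lower error
    # than every simpler (or equally complex) candidate seen so far.
    for complexity, error in sorted(candidates):
        if error < best_error:
            frontier.append((complexity, error))
            best_error = error
    return frontier

# Hypothetical pruning results: the frontier keeps only the networks
# that trade extra complexity for a genuine reduction in error.
runs = [(2, 0.9), (3, 0.5), (3, 0.7), (5, 0.4), (6, 0.45)]
print(pareto_frontier(runs))
```

In this toy input, the (3, 0.7) and (6, 0.45) networks are dominated and dropped, leaving the three frontier models whose hidden nodes would then be passed to the mixed-integer-programming stage.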