Feature selection for transient classification is the problem of choosing, among the monitored parameters (i.e., the features), those to be used for efficiently recognizing developing transient patterns. It is a critical issue for the application of "on condition" diagnostic techniques in complex systems, such as nuclear power plants, where hundreds of parameters are measured. Indeed, irrelevant and noisy features have been shown to unnecessarily increase the complexity of the classification problem and degrade the diagnostic performance. In this paper, the problem of selecting the features to be used for efficient transient classification is tackled by means of multiobjective genetic algorithms. The approach leads to the identification of a family of feature subsets that are equivalently optimal in the Pareto sense. However, in large feature spaces the standard Pareto-based multiobjective genetic algorithm search may converge to a poorly representative Pareto front: its elements may turn out to be unevenly distributed in the objective function space, and thus fail to provide a full picture of the potential Pareto-optimal solutions. To overcome this problem, a niched Pareto genetic algorithm is adopted in this work. The performance of the feature subsets examined during the search is evaluated in terms of two optimization objectives: the classification accuracy of a Fuzzy K-Nearest Neighbors classifier and the number of features in the subset. During the genetic search, the algorithm applies a controlled "niching pressure" to spread the population out in the search space, so that convergence is shared among different niches of the Pareto front, which is thus evenly covered. The method is tested on a diagnostic problem characterized by a very large number of process features available for the classification of simulated transients in the feedwater system of a boiling water reactor.
The dynamics of the transient signals is captured by wavelet decomposition, which actually increases the complexity of the search for the optimal feature subsets by tripling the number of features to be considered. © 2008 Wiley Periodicals, Inc.
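
As an illustration of what optimality "in the Pareto sense" means for the two objectives named above, the following is a minimal sketch and not the implementation used in the paper. It assumes each candidate feature subset is summarized by a hypothetical pair (accuracy, number of features), with accuracy to be maximized and subset size to be minimized. The names `Candidate`, `dominates`, and `pareto_front`, as well as the sample values, are illustrative assumptions; niching is not modeled here.

```python
from typing import List, Tuple

# Hypothetical summary of a feature subset:
# (classification accuracy, number of features in the subset)
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a Pareto-dominates b.

    a dominates b when it is no worse on both objectives
    (accuracy at least as high, feature count at least as low)
    and strictly better on at least one of them.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only (not results from the paper).
candidates = [(0.95, 12), (0.93, 5), (0.90, 5), (0.95, 20), (0.80, 2)]
front = pareto_front(candidates)
print(front)
```

In this toy example, (0.90, 5) is discarded because (0.93, 5) is more accurate with the same number of features, and (0.95, 20) is discarded because (0.95, 12) reaches the same accuracy with fewer features. The remaining subsets trade accuracy against size and are therefore equivalently optimal.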