A covariance matrix self-adaptation evolution strategy (CMSA-ES) was compared with several metaheuristic techniques for multilayer perceptron (MLP)-based function approximation and classification. Function approximation was based on simulations of several 2D functions, and classification analysis was based on nine cancer DNA microarray data sets. Connection weight learning by MLPs was carried out using genetic algorithms (GA-MLP), covariance matrix self-adaptation evolution strategies (CMSA-ES-MLP), back-propagation gradient-based learning (MLP), particle swarm optimization (PSO-MLP), and ant colony optimization (ACO-MLP). During function approximation runs, the input-side activation functions evaluated included linear, logistic, tanh, Hermite, Laguerre, exponential, and radial basis functions, while the output-side function was always linear. For classification, the input-side activation function was always logistic, while the output-side function was always regularized softmax. Self-organizing maps (SOM) and unsupervised neural gas (NG) were used to reduce the dimensionality of the original gene expression input features used in classification. Results indicate that for function approximation, the use of Hermite polynomials as activation functions at hidden nodes together with CMSA-ES-MLP connection weight learning resulted in the greatest fitness levels. On average, the most elite chromosomes were observed for MLP (MSE = 0.4977), CMSA-ES-MLP (0.6484), PSO-MLP (0.7472), ACO-MLP (1.3471), and GA-MLP (1.4845). For classification analysis, the overall average performance of the classifiers was 92.64% (CMSA-ES-MLP), 92.22% (PSO-MLP), 91.30% (ACO-MLP), 89.36% (MLP), and 60.72% (GA-MLP).

We have shown that a reliable approach to function approximation can be achieved through application of MLP connection weight learning when the assumed function is unknown. In this scenario, the MLP architecture itself defines the equation used for solving the unknown parameters relating input and output target values. A major drawback of implementing CMSA-ES in an MLP is that when the number of MLP weights N is large, the O(N³) Cholesky factorization of the covariance matrix becomes a performance bottleneck. As an alternative, feature reduction using SOM and NG can greatly enhance the performance of CMSA-ES-MLP by reducing N. Future research on speeding up the Cholesky factorization for CMSA-ES will be helpful in overcoming the time complexity problems associated with a large number of connection weights.
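
To make the Cholesky bottleneck concrete, the following is a minimal sketch of one CMSA-ES generation applied to a flattened MLP weight vector, loosely following the standard Beyer-Sendhoff formulation; the helper name cmsa_es_step, the parameter settings (tau, tau_c, lam, mu), and the fitness interface are illustrative assumptions, not the authors' implementation. The np.linalg.cholesky(C) call is the O(N³) step discussed above.

    import numpy as np

    def cmsa_es_step(m, sigma, C, fitness, lam=20, mu=5, rng=None):
        # One CMSA-ES generation. m: current mean weight vector (n,);
        # sigma: global step size; C: covariance matrix (n, n);
        # fitness: callable mapping a weight vector to a scalar loss
        # (lower is better, e.g. training MSE).
        rng = np.random.default_rng() if rng is None else rng
        n = m.size
        tau = 1.0 / np.sqrt(2.0 * n)            # step-size learning rate
        tau_c = 1.0 + n * (n + 1) / (2.0 * mu)  # covariance time constant

        A = np.linalg.cholesky(C)               # O(n^3): the bottleneck for large n

        sigmas = sigma * np.exp(tau * rng.standard_normal(lam))  # per-offspring step sizes
        Z = rng.standard_normal((lam, n))
        S = Z @ A.T                             # correlated search directions s_l = A z_l
        X = m + sigmas[:, None] * S             # candidate weight vectors

        best = np.argsort([fitness(x) for x in X])[:mu]  # truncation selection

        m_new = X[best].mean(axis=0)            # intermediate recombination of weights
        sigma_new = sigmas[best].mean()         # self-adapted step size
        rank_mu = (S[best].T @ S[best]) / mu    # averaged outer products of selected s_l
        C_new = (1.0 - 1.0 / tau_c) * C + rank_mu / tau_c
        return m_new, sigma_new, C_new

In a CMSA-ES-MLP setting, fitness would unpack the flat vector into the network's weight matrices and return the training error. With N connection weights, each generation pays the O(N³) factorization cost, which is why reducing input dimensionality with SOM or NG shrinks N and speeds up every generation.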
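
The Hermite hidden-node activation reported as the best performer for function approximation can likewise be sketched with numpy's Hermite polynomial module; the choice of the physicists' polynomials, the per-node degree, and the name hermite_activation are illustrative assumptions rather than the paper's exact setup.

    import numpy as np
    from numpy.polynomial.hermite import hermval

    def hermite_activation(x, degree=3):
        # Evaluate the physicists' Hermite polynomial H_degree at x by
        # selecting a single basis coefficient; each hidden node could be
        # assigned its own degree (an assumption, not the paper's setup).
        coeffs = np.zeros(degree + 1)
        coeffs[degree] = 1.0
        return hermval(x, coeffs)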