This paper considers robust classification as a constrained optimization problem. The constraints are nonlinear inequalities defining separating surfaces whose half spaces include or exclude the data depending on their classes, and the cost is used to attain robustness and to provide the minimum-volume regions specified by the half spaces of the surfaces. The constraints are added to the cost using penalty functions to obtain an unconstrained problem that can be solved by gradient descent. The separating surfaces found in this way are optimal in the input data space, in contrast to conventional support vector machine (SVM) classifiers designed by the Lagrange multiplier method, which are optimal in the (transformed) feature space. The paper focuses on two types of surfaces because of their generality: hyperellipsoidal surfaces and Gaussian-based surfaces created by radial basis functions (RBFs). Ellipsoidal classifiers are trained in two stages: spherical surfaces are found in the first stage, and the centers and radii found there are then used as the initial values for the second stage, which finds the centers and covariance matrix parameters of the ellipsoids. The penalty function approach to the design of robust classifiers also enables the handling of multiclass classification. Compared to SVMs, multiple-kernel SVMs, and RBF classifiers, the proposed classifiers are found to be more efficient in terms of training time, parameter setting time, testing time, memory usage, and generalization error, especially for medium to large datasets. RBF-based input space optimal classifiers are also introduced for problems that are far from ellipsoidal, e.g., 2 Spirals.
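
As an illustration of the penalty function idea for the first (spherical) stage, the following is a minimal sketch and not the implementation used in this paper. It assumes a single sphere with center c and squared radius r^2 separating two classes, a quadratic penalty on violated constraints, and a fixed penalty weight `mu`; the backtracking line search is added only to keep plain gradient descent stable. All function names and the toy data are hypothetical.

```python
import numpy as np

def penalized_cost(p, X, y, mu):
    """Cost r^2 + mu * mean_i max(0, y_i (||x_i - c||^2 - r^2))^2 (assumed form)."""
    c, r2 = p[:-1], p[-1]
    d2 = np.sum((X - c) ** 2, axis=1)
    g = np.maximum(0.0, y * (d2 - r2))  # positive only for violated constraints
    return r2 + mu * np.mean(g ** 2)

def penalized_grad(p, X, y, mu):
    """Gradient of penalized_cost with respect to (c, r^2)."""
    c, r2 = p[:-1], p[-1]
    diff = X - c
    d2 = np.sum(diff ** 2, axis=1)
    w = y * np.maximum(0.0, y * (d2 - r2))  # y_i times the violation of sample i
    grad_c = -4.0 * mu * np.mean(w[:, None] * diff, axis=0)
    grad_r2 = 1.0 - 2.0 * mu * np.mean(w)
    return np.append(grad_c, grad_r2)

def fit_sphere(X, y, mu=1e4, iters=500):
    """Gradient descent on the penalized cost; y = +1 inside, y = -1 outside."""
    inside = X[y == 1]
    c0 = inside.mean(axis=0)
    r2_0 = np.mean(np.sum((inside - c0) ** 2, axis=1))
    p = np.append(c0, r2_0)
    for _ in range(iters):
        g = penalized_grad(p, X, y, mu)
        f = penalized_cost(p, X, y, mu)
        step = 1.0
        # Backtracking (Armijo) line search: halve the step until the cost
        # decreases sufficiently. This is an assumption of the sketch.
        while step > 1e-12 and penalized_cost(p - step * g, X, y, mu) > f - 0.5 * step * (g @ g):
            step *= 0.5
        p = p - step * g
    return p[:-1], p[-1]

def predict(X, c, r2):
    """Label +1 for points inside the sphere and -1 for points outside."""
    return np.where(np.sum((X - c) ** 2, axis=1) <= r2, 1, -1)

# Hypothetical toy data: a Gaussian blob (class +1) surrounded by a ring (class -1).
rng = np.random.default_rng(0)
X_in = rng.normal(0.0, 0.4, size=(100, 2))
theta = rng.uniform(0.0, 2.0 * np.pi, size=100)
X_out = 3.0 * np.column_stack([np.cos(theta), np.sin(theta)])
X = np.vstack([X_in, X_out])
y = np.concatenate([np.ones(100), -np.ones(100)])

c, r2 = fit_sphere(X, y)
accuracy = np.mean(predict(X, c, r2) == y)
```

In this sketch the r^2 term of the cost shrinks the enclosed volume while the penalty term pushes the surface to respect the class constraints. With a finite `mu`, a few boundary points may remain slightly on the wrong side of the surface.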