In this paper, we elaborate on the use of the Sugeno integral in the context of machine learning. More specifically, we propose a method for binary classification, in which the Sugeno integral is used as an aggregation function that combines several local evaluations of an instance, pertaining to different features or measurements, into a single global evaluation. Due to the specific nature of the Sugeno integral, this approach is especially suitable for learning from ordinal data, that is, when measurements are taken from ordinal scales. This is a topic that has not received much attention in machine learning so far. The core of the learning problem itself consists of identifying the capacity underlying the Sugeno integral. To tackle this problem, we develop an algorithm based on linear programming. The algorithm also includes a suitable technique for transforming the original feature values into local evaluations (local utility scores), as well as a method for tuning a threshold on the global evaluation. To control the flexibility of the classifier and mitigate the problem of overfitting the training data, we generalize our approach toward k-maxitive capacities, where k plays the role of a hyper-parameter of the learner. We present experimental studies, in which we compare our method with competing approaches on several benchmark data sets.