even more difficult, automatic data annotation problems are often treated as high class-imbalance problems. In this paper we address the basic recognition model: the linear perceptron. Many other, more complex solutions may be built on top of it. The presented research is done from the perspective of automatic data annotation.

1.1 Linear recognition models

Training of linear models has a long history. One should note the classic Fisher's Linear Discriminant Analysis (LDA; e.g., [2]). The existence of a closed-form, analytical solution is the main advantage of discriminant analysis (both linear and quadratic). A disadvantage of linear discriminant analysis is the assumption that the covariance matrices of both classes are equal. It also suffers from typical difficulties related to zero or near-zero generalized variance [6] and covariance matrix inversion, especially for data with a large number of attributes. One possible solution is to filter out attributes associated with zero eigenvalues [6]. Another is to use Regularized Discriminant Analysis (RDA) [3]. Its basic assumption is that some recognition problems may be ill-posed due to an insufficient amount of data compared to the number of attributes. RDA combines the covariance matrix, a diagonal variance matrix, and the identity matrix, which makes the training process solvable. An interesting approach to the LDA covariance matrix calculation is given by Fukunaga [4]: it combines the two class covariance matrices using a weighted average instead of the simple average originally proposed by Fisher. An extension of LDA is Kernel-LDA [5], which uses the kernel trick known from Support Vector Machines to address linearly non-separable problems. The second family of approaches to training linear models is logistic regression (e.g., [6]). Logistic regression is

Abstract

Delta rule is a standard, well-established approach to training the perceptron recognition model. However, the mean squared error on which it is based is not a suitable estimate for some problems, such as information retrieval or automatic data annotation. F-score, a combination of precision and recall, is one of the major quality measures in these fields and can be used as an alternative. In this paper we present a perceptron training method based on F-score. An approximation of F-score is proposed, built from components that are both continuous and differentiable. This allows us to formulate a gradient-descent training routine conceptually similar to the standard delta rule.
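The idea of a continuous, differentiable F-score surrogate can be sketched as follows. This is a minimal illustration under our own assumptions, not the paper's actual formulation: hard 0/1 predictions are replaced by sigmoid outputs, so soft true positives become sum(y*s), the identity 2TP + FP + FN = sum(s) + sum(y) yields a smooth F1, and the perceptron weights are updated by gradient ascent. The helper names (`soft_f1`, `train`) and the toy hyperparameters are illustrative.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow warnings for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def soft_f1(w, X, y):
    """Differentiable surrogate of F1 (illustrative assumption, not the
    paper's exact construction). With sigmoid scores s in place of hard
    predictions: soft-TP = sum(y*s), and 2TP + FP + FN = sum(s) + sum(y),
    so soft-F1 = 2*sum(y*s) / (sum(s) + sum(y))."""
    s = sigmoid(X @ w)
    return 2.0 * np.sum(y * s) / (np.sum(s) + np.sum(y))

def soft_f1_grad(w, X, y):
    """Analytical gradient of soft_f1 with respect to w."""
    s = sigmoid(X @ w)
    ds = (s * (1.0 - s))[:, None] * X           # ds_i/dw, one row per sample
    num = 2.0 * np.sum(y * s)                   # numerator of soft-F1
    den = np.sum(s) + np.sum(y)                 # denominator of soft-F1
    # Quotient rule: d(num/den) = (num' * den - num * den') / den^2
    return (2.0 * (y @ ds) * den - num * ds.sum(axis=0)) / den ** 2

def train(X, y, lr=0.5, epochs=500):
    """Perceptron-style training by gradient ASCENT on the soft F-score."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # absorb bias into weights
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        w += lr * soft_f1_grad(w, Xb, y)
    return w
```

The update has the same shape as the delta rule (a per-epoch, gradient-driven weight correction), but the optimized objective is the smoothed F-score rather than mean squared error, which matters under the class imbalance typical of annotation tasks.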