Purpose: The purpose of this paper is to introduce, for the first time, some basic knowledge of statistical learning theory (SLT) based on random set samples in set-valued probability space, and to generalize Vapnik's key theorem and bounds on the rate of uniform convergence of learning theory to the key theorem and the bounds on the rate of uniform convergence for random sets in set-valued probability space. SLT based on random samples formed in probability space is at present considered one of the fundamental theories of statistical learning from small samples, and it has become a novel and important field of machine learning, along with other concepts and architectures such as neural networks. However, the classical theory can hardly handle statistical learning problems in which the samples are random sets.

Design/methodology/approach: Motivated by several applications, this paper develops an SLT based on random set samples. First, a law of large numbers for random sets is proved. Second, the definitions of the distribution function and the expectation of random sets are introduced, and the concepts of the expected risk functional and the empirical risk functional are discussed. A notion of the strict consistency of the principle of empirical risk minimization (ERM) is presented.

Findings: The paper formulates and proves the key theorem and presents the bounds on the rate of uniform convergence of learning theory based on random sets in set-valued probability space, which become cornerstones of the theoretical foundations of SLT for random set samples.

Originality/value: The paper provides a detailed analysis of some theoretical results of learning theory.
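For orientation, the classical point-sample definitions (after Vapnik) that the paper lifts to random set samples can be sketched in LaTeX as follows; the notation Q(z, α) for the loss, F(z) for the distribution, and ℓ for the sample size is Vapnik's standard notation and is assumed here rather than quoted from the paper.

% Expected risk and empirical risk of a function Q(z, \alpha):
\[
  R(\alpha) = \int Q(z,\alpha)\, dF(z),
  \qquad
  R_{\mathrm{emp}}(\alpha) = \frac{1}{\ell}\sum_{i=1}^{\ell} Q(z_i,\alpha).
\]
% The ERM principle minimizes R_emp over the function class. Strict
% consistency requires, for every level c and
% \Lambda(c) = \{\alpha : R(\alpha) \ge c\}, convergence in probability:
\[
  \inf_{\alpha \in \Lambda(c)} R_{\mathrm{emp}}(\alpha)
  \;\xrightarrow{P}\;
  \inf_{\alpha \in \Lambda(c)} R(\alpha)
  \quad \text{as } \ell \to \infty.
\]

The paper's contribution is to give meaning to these functionals, and to the consistency statement, when the samples z_i are random sets and the underlying measure lives on a set-valued probability space.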
Some properties of the Sugeno measure, a typical nonadditive measure, are further discussed. The definitions and properties of the g_λ random variable and of its distribution function, expected value, and variance are then presented. Markov's inequality, Chebyshev's inequality, and Khinchine's law of large numbers on Sugeno measure space are also proven. Furthermore, the concepts of the empirical risk functional, the expected risk functional, and the strict consistency of the ERM principle on Sugeno measure space are proposed. Based on these properties and concepts, the key theorem of learning theory, the bounds on the rate of convergence of the learning process, and the relations between these bounds and the capacity of the set of functions on Sugeno measure space are given.

Keywords: Sugeno measure, the empirical risk minimization principle, the key theorem, the bounds on the rate of uniform convergence.

In the 1970s, Vapnik [1][2][3] proposed Statistical Learning Theory (SLT), which deals mainly with statistical learning principles when samples are limited. SLT is an important development of, and supplement to, traditional statistics; its core idea is to control the generalization ability of a learning machine through capacity control. At the same time, a novel pattern recognition approach called the Support Vector Machine (SVM) was developed. Over more than 30 years, a large number of monographs and theses on SLT have appeared [1][2][3][4][5][6][7][8][9][10][11][12][13]. At present, SLT and SVM constitute an active research topic [4][5][6][7][8] in the field of machine learning.

The key theorem of learning theory and the bounds on the rate of uniform convergence of the learning process are important constituents of SLT. The key theorem of learning theory replaces the problem of the strict consistency of the Empirical Risk Minimization (ERM) principle with the problem of the uniform one-sided convergence of the empirical risks to the expected risks over the given set of functions.
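As background for the Sugeno-measure setting, the defining λ-rule of the Sugeno measure g_λ, together with the uniform convergence condition that the key theorem ties to strict consistency of ERM, can be sketched as follows; the form of the convergence condition mirrors Vapnik's classical formulation and is an assumption of this sketch, not a quotation from the paper.

% The Sugeno measure g_\lambda is additive on disjoint events only up
% to a \lambda-correction term (\lambda = 0 recovers ordinary probability):
\[
  g_\lambda(A \cup B) = g_\lambda(A) + g_\lambda(B)
    + \lambda\, g_\lambda(A)\, g_\lambda(B),
  \qquad A \cap B = \emptyset,\; \lambda \in (-1,\infty),\; g_\lambda(X) = 1.
\]
% By analogy with Vapnik's formulation, the key theorem relates strict
% consistency of ERM on Sugeno measure space to uniform one-sided
% convergence of empirical risks to expected risks:
\[
  \lim_{\ell \to \infty}
  g_\lambda\Bigl\{ \sup_{\alpha} \bigl( R(\alpha) - R_{\mathrm{emp}}(\alpha) \bigr) > \varepsilon \Bigr\} = 0
  \quad \text{for all } \varepsilon > 0.
\]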