The kernel method is a tool that maps data into a kernel-induced feature space in which operations can be performed implicitly through kernel functions. When mapped to a high-dimensional feature space by kernel functions, data samples are more likely to be linearly separable. Traditional machine learning methods, such as the radial basis function (RBF) network, can be extended to the kernel space. As a kernel-based method, the support vector machine (SVM) is one of the most popular nonparametric classification methods and is optimal in the sense of computational learning theory. Based on statistical learning theory and the maximum-margin principle, SVM seeks an optimal separating hyperplane by solving a quadratic programming (QP) problem. Grounded in Vapnik–Chervonenkis (VC) dimension theory, SVM maximizes generalization performance by finding the widest classification margin in the feature space. In this paper, kernel machines and SVMs are systematically introduced. We first describe how to turn classical methods into kernel machines and then review the literature on existing kernel machines. We then introduce the SVM model, its principles, and various SVM training methods for classification, clustering, and regression. Related topics, including the optimization of model architecture, are also discussed. We conclude by outlining future directions for kernel machines and SVMs. This article functions both as a state-of-the-art survey and a tutorial.
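As a brief illustration of the maximum-margin QP formulation referenced above, the following is a minimal sketch of the standard soft-margin SVM primal problem and its kernelized dual, written in common textbook notation rather than the notation used later in this survey; here $C$ is the regularization parameter, $\xi_i$ are slack variables, $\varphi$ is the (possibly implicit) feature map, and $k$ is the kernel function:
$$
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\ \frac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad y_i\big(\mathbf{w}^{\top}\varphi(\mathbf{x}_i)+b\big)\ge 1-\xi_i,\ \ \xi_i\ge 0,
$$
$$
\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j\, k(\mathbf{x}_i,\mathbf{x}_j)
\quad \text{s.t.}\quad 0\le \alpha_i\le C,\ \ \sum_{i=1}^{n}\alpha_i y_i = 0,
$$
where $k(\mathbf{x}_i,\mathbf{x}_j)=\varphi(\mathbf{x}_i)^{\top}\varphi(\mathbf{x}_j)$. The dual is a QP in $\boldsymbol{\alpha}$ that accesses the feature space only through kernel evaluations, which is the mechanism by which classical linear methods are turned into kernel machines.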