Visual sensors provide comprehensive and abundant information of surrounding environment. In this chapter, we will first give a brief introduction of visual recognition with basic concepts and algorithms, followed by the introduction of the useful software toolkit JavaScript Object Notation (JSON) framework. Then we will review a series of vision-based object recognition and tracking techniques. The detection and classification of pedestrians in infrared thermal images is investigated using deep learning method. And the algorithm for tracking single moving objects based on JSON visual recognition framework is also introduced. As the extension of single moving objects tracking, we will also show the visual tracking to multiple moving objects with the aid of the particle swarm optimization (PSO) method.
Introduction of Machine Vision RecognitionFrom our human's perspective, recognition is about searching and comparison. Our brain has a huge knowledge base, storing tens of thousands of objects. Sometimes images themselves are stored in the brain, in other cases only abstract features are stored. Those images and corresponding features are called references. Then, our eyes are opened, new images are transmitted to the brain. Those test images are compared with references, and conclusions come out.The visual recognition in machine is in some extent similar to humans' recognition. In order to recognize an object in a given image, machine needs to know what the object looks like. It learns the knowledge by loading some references before hand. Then, given a test image, matches between the references and the test samples are constructed. The machine then decides if there exists the wanted object in the test image, and locates it if possible, by analyzing the result of matches. Generally speaking, recognition is divided into two parts: detecting and locating. Detecting checks whether there exists a specific kind of object in the image. Then, locating marks regions of the detecting objects in the image.There are three key concepts that are helpful to understand the recognition problem: feature space, similarity metric, and search space and strategy. Feature space: Feature space determines the information used in matching. It may be the image itself, but specific features are used more often: edges, contours, surfaces, corners, etc. Intrinsic structures which imply the invariant properties of an image are preferred features.Similarity metric: The second key concept is related to the selection of a similarity metric. Since the metric measures the similarity among feature vectors, it is closely related to the selection of features. Typical similarity metrics include crosscorrelation, sum of absolute difference, Fourier phase correlation, sum of distance of nearest points, and many others. The choice of similarity metric is also thought to be an important task to determine the transformation between two images.Search space and strategy: The third key concept is search space and strategy. If the test image can be obtain...