The relevance of the tasks of detecting and recognizing objects in images and their sequences has only increased over the years. Over the past few decades, a huge number of approaches and methods for detecting both anomalies, that is, image areas whose characteristics differ from the predicted ones, and objects of interest, about the properties of which there is a priori information, up to the library of standards, have been proposed. In this work, an attempt is made to systematically analyze trends in the development of approaches and detection methods, reasons behind these developments, as well as metrics designed to assess the quality and reliability of object detection. Detection techniques based on mathematical models of images are considered. At the same time, special attention is paid to the approaches based on models of random fields and likelihood ratios. The development of convolutional neural networks intended for solving the recognition problems is analyzed, including a number of pre-trained architectures that provide high efficiency in solving this problem. Rather than using mathematical models, such architectures are trained using libraries of real images. Among the characteristics of the detection quality assessment, probabilities of errors of the first and second kind, precision and recall of detection, intersection by union, and interpolated average precision are considered. The paper also presents typical tests that are used to compare various neural network algorithms.
<p><strong>Abstract.</strong> The problem of detecting objects on a sequence of images with a complex structure is considered. Optimal and quasi-optimal algorithms for processing multidimensional images have been synthesized and investigated. Improved detection efficiency has been obtained by adequately describing real data using doubly stochastic random fields. The possibility of describing Earth remote sensing data using doubly stochastic models is investigated. The possibility of obtaining significant gains when filtering satellite material and detecting extended objects on it due to the adaptive structure of such models and processing time sequence of multizone images as a single multidimensional dataset is shown. The gains for filtering algorithms in the error variance are about 80% comparing single frame processing, and the gains for detecting algorithms in the signal/noise ratio are about 70% comparing single frame processing.</p>
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.