Automatic target recognition (ATR) is a challenging task for several computer vision applications. It requires efficient, accurate, and robust methods for target detection and target identification. Deep learning has shown great success in many computer vision applications involving color RGB images; however, the performance of these networks in ATR with infrared sensor data needs further investigation. In this paper, we propose a multistage automatic target detection and recognition (ATDR) system that performs both target detection and target classification on infrared (IR) imagery using deep learning. Our system processes large IR image frames in which targets occupy less than 1% of the total pixels. First, we train a state-of-the-art object detector, You Only Look Once (YOLO), to localize all potential targets in the input image frame. Then, we train a convolutional neural network (CNN) to identify these detections as targets or false alarms. In this second phase, we adapt and analyze the performance of three CNN architectures: a compact, fully connected CNN; VGG16 with batch normalization; and a wide residual network (WRN). We also explore the use of a loss function that directly optimizes the area under the receiver operating characteristic (ROC) curve (AUC) and adapt it to our ATR application. To enhance the robustness of the proposed ATR system to the perturbations and variations introduced during the detection stage, we train our CNN classifiers on targets automatically detected by YOLO, in addition to ground-truth bounding boxes, and apply selected data augmentation techniques. To simulate real testing environments, where the spatial locations of targets within the image frame are unknown, only YOLO-detected boxes are used during validation. We evaluate our ATDR system on a real benchmark dataset that includes different vehicles captured at different resolutions.
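The two-stage detect-then-classify pipeline described above can be sketched as follows. The detector and classifier here are toy stand-ins (a brightness threshold and a chip-contrast rule), not the trained YOLO and CNN models; all function names and parameters are illustrative assumptions.

```python
import numpy as np

def detect_candidates(frame):
    """Stage 1 (stand-in for YOLO): propose candidate locations.
    Toy logic: flag pixels far above the frame mean as candidates;
    a real system would run a trained detector over the frame."""
    hot = np.argwhere(frame > frame.mean() + 2 * frame.std())
    # Each candidate: (row, col, half-size, score); score is a placeholder.
    return [(int(r), int(c), 8, 1.0) for r, c in hot]

def classify_chip(chip, classes=("vehicle", "nontarget")):
    """Stage 2 (stand-in for the CNN): label a cropped chip as a
    target class or a false alarm. Toy rule: high-contrast chips are
    targets, flat chips are clutter."""
    return classes[0] if chip.std() > 1.0 else classes[1]

def atdr_pipeline(frame):
    """Run detection, crop a chip around each candidate, classify it."""
    results = []
    for r, c, s, _score in detect_candidates(frame):
        chip = frame[max(r - s, 0):r + s, max(c - s, 0):c + s]
        results.append(((r, c), classify_chip(chip)))
    return results
```

The key design point the sketch mirrors is the separation of concerns: the detector is tuned for high recall (catch every potential target), while the second-stage classifier absorbs the resulting false alarms.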
Our experiments show that YOLO can detect most of the targets, at the expense of generating a high number of false alarms. We show that the VGG16 network with batch normalization, the best-performing model, can correctly identify the classes of the targets, as well as classify the majority of YOLO's false detections into an additional nontarget class. We also show that the proposed training modification, which optimizes an AUC-based loss function for ATR, proved advantageous mainly in identifying difficult targets.
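One common family of AUC-optimizing losses, which may or may not match the exact formulation used in the paper, replaces the non-differentiable ranking indicator in the AUC with a pairwise squared-hinge surrogate. A minimal numpy sketch, assuming binary labels and a single score per sample:

```python
import numpy as np

def auc_hinge_loss(scores, labels, margin=1.0):
    """Pairwise squared-hinge surrogate for (1 - AUC).
    AUC is the probability that a random positive outranks a random
    negative; this surrogate penalizes every (positive, negative)
    score pair where the positive does not exceed the negative by at
    least `margin`, making the objective differentiable."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diffs = pos[:, None] - neg[None, :]  # all positive-negative score gaps
    return float(np.mean(np.square(np.maximum(0.0, margin - diffs))))
```

Because the loss is averaged over positive-negative pairs rather than over individual samples, it is insensitive to the heavy class imbalance typical of ATR, where false alarms vastly outnumber true targets.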
Lokator is a baseball training system designed to document pitch location while teaching pitch command, selection, and sequencing. It is composed of a pitching target and a smartphone app. The target is divided into a set of zones used to identify the pitch location. The main limitation of the current system is its reliance on user feedback: after each throw, the pitcher or the coach must identify and report the target zone hit by the ball, relying only on the naked eye. The purpose of this thesis is to investigate the possibility of using computer vision technology to automate pitch analysis in baseball and to improve the usability and accuracy of the Lokator system. Toward this goal, we have developed, implemented, and tested a computer vision-based software system that adds the following contributions to the Lokator system: 1. Automated and accurate reading of the pitch location on the target. 4. Replacement of the target by a catcher and estimation of the pitch location using a virtual target. We have tested the software on a large set of recordings captured in indoor and outdoor environments with various illumination conditions and different backgrounds. The software was also tested on videos of softball pitches. To estimate the accuracy of the software, the sponsor provided a set of 15 videos that include a total of 144 pitches, along with the hit location of each pitch. Another set of 8 videos was provided to measure the accuracy of our software in terms of speed calculation.
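The zone-reading step can be sketched as a change of coordinates from image pixels into the target's own frame, followed by a grid lookup. The function name, the 3x3 zone layout, and the affine (rather than full homography) mapping below are all illustrative assumptions; the thesis does not specify the implementation here.

```python
import numpy as np

def locate_zone(hit_px, target_corners, rows=3, cols=3):
    """Map a detected ball-impact pixel to a zone index on the target.
    target_corners: (top-left, top-right, bottom-left) pixel positions
    of the target in the image. Assumes a roughly fronto-parallel view,
    so an affine change of basis suffices; a real system would estimate
    a full homography to handle perspective. Zones are indexed
    row-major, 0 .. rows*cols - 1 (a hypothetical numbering)."""
    tl, tr, bl = (np.asarray(p, dtype=float) for p in target_corners)
    basis = np.column_stack([tr - tl, bl - tl])  # target axes in pixels
    # (u, v) are normalized coordinates in [0, 1] across the target face.
    u, v = np.linalg.solve(basis, np.asarray(hit_px, dtype=float) - tl)
    col = min(int(u * cols), cols - 1)
    row = min(int(v * rows), rows - 1)
    return row * cols + col
```

For example, with a square target whose corners sit at (0, 0), (90, 0), and (0, 90) in the image, an impact at pixel (45, 45) falls in the center zone.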
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.