Vision-based vehicle detection approaches have achieved remarkable success in recent years with the development of deep convolutional neural networks (CNNs). However, existing CNN-based algorithms suffer from the fact that convolutional features are scale-sensitive in the object detection task, while traffic images and videos commonly contain vehicles with a large variance of scales. In this paper, we delve into the source of scale sensitivity and reveal two key issues: 1) existing RoI pooling destroys the structure of small-scale objects; 2) the large intra-class distance caused by a large variance of scales exceeds the representation capability of a single network. Based on these findings, we present a scale-insensitive convolutional neural network (SINet) for fast detection of vehicles with a large variance of scales. First, we present a context-aware RoI pooling to maintain the contextual information and original structure of small-scale objects. Second, we present a multi-branch decision network to minimize the intra-class distance of features. These lightweight techniques add no extra time complexity yet bring a prominent improvement in detection accuracy. The proposed techniques can be integrated into any deep network architecture while keeping it trainable end-to-end. Our SINet achieves state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on the KITTI benchmark and on a new highway dataset, which contains a large variance of scales and extremely small objects.
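The first issue can be illustrated with a small sketch. The code below is not the paper's context-aware RoI pooling, whose details the abstract does not give. It only shows why pooling a region smaller than the output grid replicates feature values, and it uses plain bilinear interpolation as an assumed stand-in for a structure-preserving enlargement. The function names `roi_max_pool` and `bilinear_resize` are hypothetical, as are the toy 2x2 feature map and the 4x4 output size.

```python
import math

def roi_max_pool(feature, out_size):
    """Naive RoI max pooling of a 2-D list into out_size x out_size bins."""
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(out_size):
        # Bin boundaries via floor/ceil; every bin covers at least one cell.
        r0 = math.floor(i * h / out_size)
        r1 = max(math.ceil((i + 1) * h / out_size), r0 + 1)
        row = []
        for j in range(out_size):
            c0 = math.floor(j * w / out_size)
            c1 = max(math.ceil((j + 1) * w / out_size), c0 + 1)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled

def bilinear_resize(feature, out_size):
    """Bilinear enlargement (corner-aligned); an assumed stand-in, not SINet's method."""
    h, w = len(feature), len(feature[0])
    out = []
    for i in range(out_size):
        y = i * (h - 1) / (out_size - 1) if out_size > 1 else 0.0
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, h - 1)
        dy = y - y0
        row = []
        for j in range(out_size):
            x = j * (w - 1) / (out_size - 1) if out_size > 1 else 0.0
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, w - 1)
            dx = x - x0
            top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
            bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out

# Toy 2x2 "small object" feature map, enlarged to a 4x4 grid both ways.
small = [[1.0, 2.0], [3.0, 4.0]]
pooled = roi_max_pool(small, 4)
resized = bilinear_resize(small, 4)
```

In this toy case the pooled output consists of four constant 2x2 blocks, so neighbouring bins carry identical values, whereas the interpolated output varies smoothly between the original corner values.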
As a driving force of the current technological transformation, robust and trustworthy artificial intelligence is in greater need than ever. Despite achieving expert-level accuracies on many disease-screening tasks (1-9), deep learning (DL)-based (10) artificial intelligence models can make correct decisions for the wrong reasons (11-13) and demonstrate considerably degraded performance when applied to external data (13-15). This phenomenon is referred to as "shortcut learning" (16), wherein deep neural networks unintentionally learn dataset biases (17) to fit the training data quickly. Specifically, dataset biases are patterns that frequently co-occur with the target disease and are more easily recognized than the true disease signs (18). Although widely adopted DL diagnosis models are often developed with image-level binary annotations (with "1" indicating the presence and "0" indicating the absence of disease), such spurious correlations could be