Deep learning algorithms are highly effective at complex tasks such as image classification and object detection. Over the past few decades, a variety of convolutional neural network (CNN) architectures have been developed to improve object detection accuracy. However, factors such as brightness, occlusion, viewing distance, and background components can change the appearance, size, and shape of objects, making detection more challenging. A CenterNet model based on a multi-head external attention mechanism is proposed to improve detection accuracy. Features are extracted using the Hourglass-104 network and an Adaptive Feature Pyramid Network (AFPN). High-level features are derived through contextual modeling using an adaptive feature fusion (AFF) module with multi-head external attention. A hybrid method, adaptive hybrid feature selection (AHFS), selects the best features, after which the improved CenterNet predicts the objects. Detection accuracy was evaluated on the MS-COCO dataset using the average precision (AP), mean average precision (mAP), and average recall (AR) metrics. The proposed method achieves 64.76% on the MS-COCO dataset, improving accuracy by 2.5% compared to other state-of-the-art models.
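The abstract does not spell out how the multi-head external attention inside the AFF module is computed. As a rough illustration only, the sketch below follows the commonly used external-attention formulation: two small learnable memories shared across all inputs, with a double normalization (softmax over tokens, then L1 normalization over memory slots). The function names, the shapes, and the choice to share memories across heads and omit input/output projections are assumptions made for this sketch, not details taken from the paper.

```python
import math


def external_attention(tokens, mem_k, mem_v):
    """Single-head external attention (assumed formulation, not the paper's code).

    tokens: N x d list of feature vectors
    mem_k:  S x d external key memory
    mem_v:  S x d external value memory
    Returns an N x d list of attended features.
    """
    n, s = len(tokens), len(mem_k)
    # Similarity between every token and every memory slot: N x S.
    logits = [[sum(t * k for t, k in zip(tok, key)) for key in mem_k]
              for tok in tokens]
    attn = [[0.0] * s for _ in range(n)]
    # Double normalization, step 1: softmax over the token axis per slot.
    for j in range(s):
        col = [logits[i][j] for i in range(n)]
        peak = max(col)
        exps = [math.exp(c - peak) for c in col]
        total = sum(exps)
        for i in range(n):
            attn[i][j] = exps[i] / total
    # Double normalization, step 2: L1-normalize over memory slots per token.
    for i in range(n):
        total = sum(attn[i]) + 1e-9
        attn[i] = [a / total for a in attn[i]]
    # Weighted sum of the value memory: N x d.
    d = len(mem_v[0])
    return [[sum(attn[i][j] * mem_v[j][c] for j in range(s)) for c in range(d)]
            for i in range(n)]


def multi_head_external_attention(tokens, mem_k, mem_v, num_heads):
    """Split channels into heads, attend per head, then concatenate.

    The memories have d / num_heads channels and are shared by all heads
    (an assumption of this sketch).
    """
    d = len(tokens[0])
    if d % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    head_dim = d // num_heads
    outputs = [[] for _ in tokens]
    for h in range(num_heads):
        head = [tok[h * head_dim:(h + 1) * head_dim] for tok in tokens]
        for i, row in enumerate(external_attention(head, mem_k, mem_v)):
            outputs[i].extend(row)
    return outputs
```

Because the memories are independent of the input, the cost grows linearly with the number of tokens rather than quadratically as in self-attention, which is the usual motivation for using external attention on dense feature maps.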