To train deep neural networks effectively, large amounts of labeled data are typically needed. However, in real-world applications, acquiring high-quality labels is difficult and expensive because accurately annotating multi-label images requires skill and domain knowledge. To improve classification performance, it is also crucial to extract image features from all potential objects of various sizes and to model the relationships among the labels of multi-label images. Current approaches fall short in their ability to map label dependencies and classify labels effectively, and they perform poorly at labeling unlabeled images when only a small amount of labeled data is available for classification. To address these issues, we propose a new framework for semi-supervised multi-object label classification using multi-stage convolutional neural networks with visual attention (MSCNN) and a GCN for label co-occurrence embedding (LCE), named MSCNN-LCE-MIC, which combines a GCN and an attention mechanism to concurrently capture local and global label dependencies throughout the image classification process. MSCNN-LCE-MIC comprises four main modules: (1) an improved multi-label propagation method for labeling the largely available unlabeled images; (2) a feature extraction module using a multi-stage CNN with a visual attention mechanism that focuses on the connections between labels and target regions to extract accurate features from each input image; (3) a label co-occurrence learning module that applies a GCN to discover associations between different objects and build label co-occurrence embeddings; and (4) an integrated multi-modal fusion module. Extensive experiments on MS-COCO and PASCAL VOC 2007 show that MSCNN-LCE-MIC significantly improves classification performance, reaching 84.3% and 95.8% mAP respectively, compared with the most recent existing methods.
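The label co-occurrence embedding idea in module (3) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the toy co-occurrence counts, the 0.3 binarization threshold, and the random weight matrix are all placeholder assumptions; in practice the adjacency is built from training-set label statistics and the weights are learned.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Toy co-occurrence counts for 3 labels (how often each label pair
# appears together in the training images) -- illustrative numbers only.
counts = np.array([[0., 8., 5.],
                   [8., 0., 7.],
                   [5., 7., 0.]])
# Binarize conditional co-occurrence probabilities into an adjacency matrix.
A = (counts / counts.sum(axis=1, keepdims=True) > 0.3).astype(float)

H0 = np.eye(3)                       # initial label embeddings (one-hot)
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))          # learnable weights (random stand-in)
embeddings = gcn_layer(A, H0, W)     # (3, 4) label co-occurrence embeddings
```

Stacking such layers lets each label's embedding absorb information from the labels it frequently co-occurs with, which is what allows the classifier to exploit global label dependencies.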
Deep convolutional neural network (CNN) classification of single-object image labels has shown high efficiency. However, the great bulk of real application data consists of multi-label object images containing a variety of scenes, objects, and actions in a single image. Most recent studies on multi-object label classification rely on individual classifiers for each label category and use probability ranking for the final classification. While these existing methods perform well, they cannot capture the dependencies between multiple labels in an image. In this paper, we use a deep CNN architecture together with long short-term memory (LSTM) to address the problem of label dependence. Our proposed CNN-LSTM methodology learns object-label embeddings to represent semantic object-label dependence and image-label association using a robust multi-label classifier cost function (RMLC), a ramp loss function. Feature extraction is carried out by a CNN pipeline, while multi-object label correlation is identified by the LSTM using object labels and the features extracted from input images. The loss function ensures that correlated labels and their corresponding features map close to each other, limits large weight updates for images with improper labels, and lets object-label prediction improve over iterations, which benefits the multi-label learning task. Experiments with the proposed framework on benchmark visual recognition datasets such as CIFAR-10, STL-10, PASCAL VOC 2007, MS-COCO, and NUS-WIDE yielded performance better than many existing methods in terms of accuracy and mean average precision.
The CNN-LSTM + RMLC achieves the best test accuracy of 95.56% on the STL-10 dataset, which is 4% higher than the existing method, and the best mean average precision (mAP) of 82.6% on the MS-COCO dataset, demonstrating the feasibility and usefulness of our proposed framework for multi-label image classification.
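The robustness claim above comes from the ramp loss: unlike the unbounded hinge loss, it caps the penalty for badly mislabeled examples, so noisy labels cannot dominate the weight updates. A generic sketch, assuming the standard clipped-hinge form of ramp loss with ±1 label encoding (the clipping point `s` is an illustrative choice, not a value from the paper):

```python
import numpy as np

def ramp_loss(scores, labels, s=-1.0):
    """Ramp loss: hinge loss clipped at margin s, so each example's
    penalty is bounded by (1 - s), limiting weight updates from
    images with improper labels.
    scores: raw classifier outputs; labels: +1/-1 per label."""
    z = labels * scores                   # signed margins
    hinge = np.maximum(0.0, 1.0 - z)      # standard hinge H_1(z)
    capped = np.maximum(0.0, s - z)       # shifted hinge H_s(z)
    return (hinge - capped).mean()        # bounded in [0, 1 - s]
```

For a confidently correct prediction (large positive margin) the loss is 0; for a grossly mislabeled example (large negative margin) it saturates at 1 - s instead of growing without bound, which is what keeps gradient updates from noisy labels small.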