Cross-Modal Object Detection Based on a Knowledge Update

Gao, Yueqing; Zhou, Huachun; Chen, Lulu; Shen, Y.; Guo, Ce; Zhang, Xinyu

doi:10.3390/s22041338

Cited by 2 publications

(3 citation statements)

References 23 publications


“…Several studies have developed diverse deep cross-modal representation learning methods that utilize neural networks to extract image and text features and build relationships between different modalities. For example, Gao et al. [27] proposed an image encoder, text encoder, and multi-modal encoder to extract text features and image features and mine rich feature information. Viviana et al. [28] used convolutional neural networks to deeply extract features from images and text in the stage of establishing connections between different modal data.…”

Section: Related Work (mentioning)

confidence: 99%

Hybrid DAER Based Cross-Modal Retrieval Exploiting Deep Representation Learning

Huang

2023

Entropy


Information retrieval across multiple modes has attracted much attention from academics and practitioners. One key challenge of cross-modal retrieval is to eliminate the heterogeneous gap between different patterns. Most of the existing methods tend to jointly construct a common subspace. However, very little attention has been given to the study of the importance of different fine-grained regions of various modalities. This lack of consideration significantly influences the utilization of the extracted information of multiple modalities. Therefore, this study proposes a novel text-image cross-modal retrieval approach that constructs a dual attention network and an enhanced relation network (DAER). More specifically, the dual attention network tends to precisely extract fine-grained weight information from text and images, while the enhanced relation network is used to expand the differences between different categories of data in order to improve the computational accuracy of similarity. The comprehensive experimental results on three widely-used major datasets (i.e., Wikipedia, Pascal Sentence, and XMediaNet) show that our proposed approach is effective and superior to existing cross-modal retrieval methods.


“…Target detection technology has a very important role in the field of machine vision and practical applications in life, so the automatic detection of targets is an important research task [24–27]. The performance of target detection can be seriously affected by complex background, target occlusion, noise interference, low resolution, scale and pose variation and so forth [28]. Among the many instruments in substations, the appearance and expression of instruments with different functions are very different, but there are similar characteristics for different categories of industrial instruments [29–32].…”

Section: Introduction (mentioning)

confidence: 99%

“…[24–27] The performance of target detection can be seriously affected by complex background, target occlusion, noise interference, low resolution, scale and pose variation and so forth. [28] Among the many instruments in substations, the appearance and expression of instruments with different functions are very different, but there are similar characteristics for different categories of industrial instruments. [29–32] At the same time, when acquiring the image information of substation meters, the different acquisition methods of substation inspection robots will also lead to some different types of interference information, such as picture noise, which will affect the accuracy of detection results to a certain extent.…”

Section: Introduction (mentioning)

confidence: 99%

Substation instrumentation target detection based on multi‐scale feature fusion

Feng, Huang, Sun, et al.

2022

Concurrency and Computation


SUMMARY With the promotion of smart grid construction work, the use of high‐precision and high‐efficiency substation inspection robots has become the development trend of substation inspection. A multi‐scale feature fusion meter target detection algorithm is proposed to address the low efficiency of the traditional manual meter reading method and its susceptibility to surrounding environmental factors. Kinect is used to acquire color images of substation meters with different backgrounds, light intensities, and angles to build a substation meter dataset. Based on the complementarity and correlation of multi‐scale features, an SSD target detection model with multi‐scale feature fusion is established. The performance of the algorithm is tested on the constructed dataset, and comparative experiments are conducted to verify that the algorithm improves target detection accuracy.
