Yanzhao Xie scite author profile

Yanzhao Xie

4 Publications

37 Citation Statements Received

56 Citation Statements Given

Affiliations

Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Lanzhou University of Technology

Publications

Fast Graph Convolution Network Based Multi-label Image Recognition via Cross-modal Fusion

Wang, Xie, Liu et al. 2020

In multi-label image recognition, modeling label dependencies to predict the labels that co-occur in an image has become a popular approach. Previous works focus on capturing the correlation between labels but neglect to fuse image features and label embeddings effectively, which severely slows model convergence and limits further precision gains. To overcome this shortcoming, we introduce Multi-modal Factorized Bilinear pooling (MFB) as an efficient component for fusing cross-modal embeddings and propose F-GCN, a fast graph convolution network (GCN) based multi-label image recognition model. F-GCN consists of three key modules: (1) an image representation learning module, which uses a convolutional neural network (CNN) to learn and generate image representations; (2) a label co-occurrence embedding module, which first obtains label vectors via word embeddings and then applies a GCN to capture label co-occurrence embeddings; and (3) an MFB fusion module, which efficiently fuses these cross-modal vectors to enable an end-to-end model with a multi-label loss function. We conduct extensive experiments on two multi-label datasets, MS-COCO and VOC2007. The results show that the MFB component efficiently fuses image representations and label co-occurrence embeddings and thus greatly improves the convergence efficiency of the model. Recognition performance also improves over state-of-the-art methods. CCS Concepts: Computing methodologies → Image representations.
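The fusion step named in this abstract can be illustrated with a minimal NumPy sketch of factorized bilinear pooling as it is commonly described: project both vectors, multiply element-wise, sum-pool over windows of size k, then normalize. The function name `mfb_fuse`, the random projection matrices, and all dimensions are assumptions made for illustration; the abstract does not give the authors' actual implementation or hyperparameters.

```python
import numpy as np

def mfb_fuse(image_vec, label_vec, U, V, k):
    """Fuse two vectors with factorized bilinear pooling (illustrative sketch)."""
    # Project both modalities into a shared (k * o)-dimensional space
    # and combine them with an element-wise product.
    joint = (image_vec @ U) * (label_vec @ V)
    # Sum-pool over non-overlapping windows of size k -> o-dimensional vector.
    z = joint.reshape(-1, k).sum(axis=1)
    # Signed square-root (power) normalization, then L2 normalization.
    z = np.sign(z) * np.sqrt(np.abs(z))
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z

# Hypothetical sizes: 8-d image feature, 6-d label embedding, k=3, output o=4.
rng = np.random.default_rng(0)
d_img, d_lbl, k, o = 8, 6, 3, 4
U = rng.standard_normal((d_img, k * o))  # stand-in for a learned projection
V = rng.standard_normal((d_lbl, k * o))  # stand-in for a learned projection
fused = mfb_fuse(rng.standard_normal(d_img), rng.standard_normal(d_lbl), U, V, k)
```

In a trained model U and V would be learned parameters rather than random matrices; the sketch only shows the shape of the computation.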

Label-Attended Hashing for Multi-Label Image Retrieval

Xie, Liu, Wang et al. 2020

For multi-label image retrieval, existing hashing algorithms neglect the dependency between objects and thus fail to capture attention information during feature extraction, which reduces the precision of the hash codes. To address this problem, we explore the inter-dependency between objects through their co-occurrence correlation in the label set and adopt a Multi-modal Factorized Bilinear (MFB) pooling component so that image representation learning can capture this attention information. We propose a Label-Attended Hashing (LAH) algorithm, which enables an end-to-end hash model with inter-dependency feature extraction. LAH first combines a Convolutional Neural Network (CNN) and a Graph Convolution Network (GCN) to separately generate the image representation and the label co-occurrence embeddings, then adopts MFB to fuse these two modal vectors, and finally learns the hash function with a Cauchy distribution based loss function via back propagation. Extensive experiments on public multi-label datasets demonstrate that (1) LAH achieves state-of-the-art retrieval results and (2) the use of the co-occurrence relationship and MFB not only improves the precision of the hash codes but also accelerates hash learning. GitHub address: https://github.com/IDSM-AI/LAH.
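As a rough illustration of the label co-occurrence idea in this abstract, the sketch below estimates conditional co-occurrence probabilities from a binary label matrix and applies one row-normalized graph convolution step. The function names, the toy label matrix, the row normalization, and the random weights are all assumptions for illustration; the abstract does not specify how LAH builds or normalizes its label graph.

```python
import numpy as np

def cooccurrence_matrix(labels):
    """Estimate P[i, j] ~ P(label j present | label i present).

    labels: (n_images, n_labels) binary matrix.
    """
    labels = np.asarray(labels, dtype=float)
    counts = labels.T @ labels             # counts[i, j]: images with both i and j
    occurrences = np.diag(counts).copy()   # counts[i, i]: images containing label i
    occurrences[occurrences == 0] = 1.0    # avoid division by zero for unused labels
    return counts / occurrences[:, None]

def gcn_layer(A, H, W):
    """One graph convolution step with row-normalized adjacency: ReLU(A_norm @ H @ W)."""
    row_sum = A.sum(axis=1, keepdims=True)
    row_sum[row_sum == 0] = 1.0            # leave isolated nodes as zero rows
    return np.maximum((A / row_sum) @ H @ W, 0.0)

# Toy data (hypothetical): 3 images, 3 labels.
labels = [[1, 1, 0],
          [1, 0, 0],
          [0, 1, 1]]
P = cooccurrence_matrix(labels)

rng = np.random.default_rng(0)
H = rng.standard_normal((3, 5))  # stand-in for 5-d word embeddings of the labels
W = rng.standard_normal((5, 2))  # stand-in for learned layer weights
label_embeddings = gcn_layer(P, H, W)
```

Here `P[0, 1]` is 0.5 because label 1 appears in one of the two images that contain label 0, while `P[2, 1]` is 1.0 because the only image with label 2 also has label 1.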

An intelligent hybrid model for power flow optimization in the cloud-IOT electrical distribution network

2017

Cross-modal fusion for multi-label image classification with attention mechanism

Wang, Xie, Zeng et al. 2022

Computers and Electrical Engineering
