Although the quality of automatically generated high-resolution, realistic images from text descriptions has improved significantly, many challenging issues in image synthesis remain under-explored due to shape variations, viewpoint changes, pose changes, and the relations among multiple objects. In this article, we propose a novel end-to-end approach for text-to-image synthesis with spatial constraints that mines object spatial location and shape information. Instead of learning a hierarchical mapping from text to image, our algorithm directly generates multi-object, fine-grained images under the guidance of generated semantic layouts. By fusing text semantics and spatial information in a synthesis module and jointly fine-tuning them with the generated multi-scale semantic layouts, the proposed networks show impressive performance in text-to-image synthesis for complex scenes. We evaluate our method on both the single-object CUB dataset and the multi-object MS-COCO dataset. Comprehensive experimental results demonstrate that our method consistently and significantly outperforms state-of-the-art approaches across different evaluation metrics.
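For intuition, the following is a minimal PyTorch-style sketch of the kind of layout-guided fusion the abstract describes: a sentence embedding is broadcast over a generated semantic layout and mixed at more than one scale. The class name LayoutTextFusion, the channel sizes, and the toy tensors are illustrative assumptions, not the authors' implementation.

# A minimal sketch (assumed, not the authors' code) of fusing a sentence
# embedding with a generated semantic layout to condition image synthesis.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayoutTextFusion(nn.Module):
    """Broadcast a text embedding over a semantic layout and fuse them."""
    def __init__(self, text_dim=256, layout_channels=81, out_channels=128):
        super().__init__()
        # A 1x1 conv mixes the concatenated text and layout channels.
        self.fuse = nn.Conv2d(text_dim + layout_channels, out_channels, kernel_size=1)

    def forward(self, text_emb, layout):
        # text_emb: (B, text_dim); layout: (B, layout_channels, H, W)
        b, _, h, w = layout.shape
        # Tile the sentence embedding to every spatial location of the layout.
        text_map = text_emb[:, :, None, None].expand(b, -1, h, w)
        fused = torch.cat([text_map, layout], dim=1)
        return F.relu(self.fuse(fused))

# Toy usage at two scales, mimicking multi-scale layout guidance.
fusion_64 = LayoutTextFusion()
fusion_128 = LayoutTextFusion()
text = torch.randn(2, 256)
layout_64 = torch.randn(2, 81, 64, 64)                 # coarse layout
layout_128 = F.interpolate(layout_64, scale_factor=2)  # finer layout
feat_64 = fusion_64(text, layout_64)                   # (2, 128, 64, 64)
feat_128 = fusion_128(text, layout_128)                # (2, 128, 128, 128)

The conditioned feature maps would then feed the image-generation stages at the corresponding resolutions, so spatial constraints from the layout and semantics from the text are available jointly at every scale.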
Semantic image synthesis is an emerging and challenging vision problem driven by recent promising advances in generative adversarial networks. Existing semantic image synthesis methods consider only the global information provided by the semantic segmentation mask, such as class labels, global layout, and location, so the generative models cannot capture the rich local fine-grained information of the images (e.g., object structure, contour, and texture). To address this issue, we adopt a multi-scale feature fusion algorithm that refines the generated images by learning the fine-grained information of local objects. We propose OA-GAN, a novel object-attention generative adversarial network that allows attention-driven, multi-fusion refinement for fine-grained semantic image synthesis. Specifically, the proposed model first generates multi-scale global image features and local object features; the local object features are then fused into the global image features to strengthen the correlation between local and global representations. During feature fusion, the global image features and the local object features are combined through a channel-spatial-wise fusion block that learns ‘what’ and ‘where’ to attend along the channel and spatial axes, respectively. The fused features are used to construct correlation filters that yield feature response maps determining the locations, contours, and textures of the objects. Extensive quantitative and qualitative experiments on the COCO-Stuff, ADE20K, and Cityscapes datasets demonstrate that our OA-GAN significantly outperforms state-of-the-art methods.
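To make the channel-spatial-wise idea concrete, below is a minimal PyTorch sketch of a CBAM-style channel-then-spatial attention step applied to an additive fusion of global and local features. The class name ChannelSpatialFusion, the reduction ratio, and the kernel size are illustrative assumptions, not the OA-GAN block itself.

# A minimal sketch (assumed, not the paper's exact block) of channel-then-spatial
# attention used to fuse local object features into global image features.
import torch
import torch.nn as nn

class ChannelSpatialFusion(nn.Module):
    def __init__(self, channels=256, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, learn per-channel weights ("what").
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over pooled maps yields per-pixel weights ("where").
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, global_feat, object_feat):
        # Both inputs: (B, C, H, W); start from a simple additive fusion.
        x = global_feat + object_feat
        b, c, _, _ = x.shape
        # "What" to attend: channel weights from global average pooling.
        ca = torch.sigmoid(self.channel_mlp(x.mean(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * ca
        # "Where" to attend: spatial weights from channel-wise average and max maps.
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        sa = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
        return x * sa

# Toy usage: fuse a local object feature map into the global image features.
fusion = ChannelSpatialFusion(channels=256)
global_feat = torch.randn(2, 256, 32, 32)
object_feat = torch.randn(2, 256, 32, 32)
fused = fusion(global_feat, object_feat)  # (2, 256, 32, 32)

In this reading, the channel weights select which feature maps matter for the current object, while the spatial weights localize where in the image the object should influence synthesis; the fused output could then drive the correlation filters described in the abstract.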