Stacked Wasserstein Autoencoder

Xu, Wenju; Keshmiri, Shawn; Wang, Guanghui

doi:10.1016/j.neucom.2019.06.096

Cited by 18 publications

(6 citation statements)

References 25 publications

(28 reference statements)


“…Deep neural networks have shown great potential in dealing with real-world tasks [11], [12], [13], [14], [15], [16], [17]. Many deep learning based methods were proposed for image content understanding [18], [19] and image content generation tasks [20], [21], [22].…”

Section: Related Work (mentioning)

confidence: 99%

A Domain Gap Aware Generative Adversarial Network for Multi-domain Image Translation

Xu, Wang

2021

Preprint

Self Cite


Recent image-to-image translation models have shown great success in mapping local textures between two domains. Existing approaches rely on a cycle-consistency constraint that supervises the generators to learn an inverse mapping. However, learning the inverse mapping introduces extra trainable parameters and it is unable to learn the inverse mapping for some domains. As a result, they are ineffective in the scenarios where (i) multiple visual image domains are involved; (ii) both structure and texture transformations are required; and (iii) semantic consistency is preserved. To solve these challenges, the paper proposes a unified model to translate images across multiple domains with significant domain gaps. Unlike previous models that constrain the generators with the ubiquitous cycle-consistency constraint to achieve the content similarity, the proposed model employs a perceptual self-regularization constraint. With a single unified generator, the model can maintain consistency over the global shapes as well as the local texture information across multiple domains. Extensive qualitative and quantitative evaluations demonstrate the effectiveness and superior performance over state-of-the-art models. It is more effective in representing shape deformation in challenging mappings with significant dataset variation across multiple domains.



“…Image-to-image translation is a popular topic in computer vision [43], [44]. With the advent of Generative Adversarial Networks [15], it could be mainly categorized as supervised image-to-image translation and unsupervised image-to-image translation [1].…”

Section: Related Work (mentioning)

confidence: 99%

Six-channel Image Representation for Cross-domain Object Detection

Zhang, Ma, Wang

2021

Preprint

Self Cite


Most deep learning models are data-driven and the excellent performance is highly dependent on the abundant and diverse datasets. However, it is very hard to obtain and label the datasets of some specific scenes or applications. If we train the detector using the data from one domain, it cannot perform well on the data from another domain due to domain shift, which is one of the big challenges of most object detection models. To address this issue, some image-to-image translation techniques are employed to generate some fake data of some specific scenes to train the models. With the advent of Generative Adversarial Networks (GANs), we could realize unsupervised image-to-image translation in both directions from a source to a target domain and from the target to the source domain. In this study, we report a new approach to making use of the generated images. We propose to concatenate the original 3-channel images and their corresponding GAN-generated fake images to form 6-channel representations of the dataset, hoping to address the domain shift problem while exploiting the success of available detection models. The idea of augmented data representation may inspire further study on object detection and other applications.


“…Extracting meaningful information from the environment is a challenging task [40,39,51]. In recent years, deep neural networks are becoming more and more popular for knowledge discovering in many computer vision tasks, such as object recognition [44,50], object detection [24,19], visual question answering [45], pose estimation [17], image synthesis [42,41,43], face recognition [7], and depth estimation [15]. Object detection is the task of recognizing and localizing the objects in the images with the deep model trained on labelled ground truth [25].…”

Section: Related Work (mentioning)

confidence: 99%

Adaptively Denoising Proposal Collection for Weakly Supervised Object Localization

et al. 2019

Neural Process Lett

Self Cite


In this paper, we address the problem of weakly supervised object localization (WSL), which trains a detection network on the dataset with only image-level annotations. The proposed approach is built on the observation that the proposal set from the training dataset is a collection of background, object parts, and objects. Several strategies are taken to adaptively eliminate the noisy proposals and generate pseudo object-level annotations for the weakly labeled dataset. A multiple instance learning (MIL) algorithm enhanced by mask-out strategy is adopted to collect the class-specific object proposals, which are then utilized to adapt a pretrained classification network to a detection network. In addition, the detection results from the detection network are re-weighted by jointly considering the detection scores and the overlap ratio of proposals in a proposal subset optimization framework. The optimal proposals work as object-level labels that enable a pseudo-strongly supervised dataset for training the detection network. Consequently, we establish a fully adaptive detection network. Extensive evaluations on the PASCAL VOC 2007 and 2012 datasets demonstrate a significant improvement compared with the state-of-the-art methods.

