How Good Is My GAN?

Shmelkov, Konstantin; Schmid, Cordelia; Alahari, Karteek

doi:10.1007/978-3-030-01216-8_14

Cited by 320 publications

(213 citation statements)

References 27 publications

Supporting

Mentioning

211

Contrasting

“…The result is then compared with the score of the same classifier trained on the real training data mixed with noise. Along the same line, recently, Shmelkov et al [25] proposed to compare class-conditional GANs with GAN-train and GAN-test scores using a neural net classifier. GAN-train is a network trained on GAN generated images and is evaluated on real-world images.…”

Section: Mode Drop and Collapse (mentioning)

confidence: 99%

Pros and cons of GAN evaluation measures

Borji

2019

Computer Vision and Image Understanding

720

438

Generative models, in particular generative adversarial networks (GANs), have gained significant attention in recent years. A number of GAN variants have been proposed and have been utilized in many applications. Despite large strides in terms of theoretical progress, evaluating and comparing GANs remains a daunting task. While several measures have been introduced, as of yet, there is no consensus as to which measure best captures strengths and limitations of models and should be used for fair model comparison. As in other areas of computer vision and machine learning, it is critical to settle on one or few good measures to steer the progress in this field. In this paper, I review and critically discuss more than 24 quantitative and 5 qualitative measures for evaluating generative models with a particular emphasis on GAN-derived models. I also provide a set of 7 desiderata followed by an evaluation of whether a given measure or a family of measures is compatible with them.
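
The GAN-train and GAN-test scores mentioned in the citation statement above can be illustrated with a minimal sketch. The statement defines GAN-train as a classifier trained on generated images and evaluated on real images; GAN-test is the reverse, a classifier trained on real images and evaluated on generated ones. The sketch below makes several assumptions that are not from the source: a nearest-centroid classifier is swapped in for the neural network classifier, 2-D Gaussian points stand in for images, and the helper names (`fit_centroids`, `accuracy`, `sample`) are hypothetical.

```python
import numpy as np

def fit_centroids(x, y):
    """Train a nearest-centroid classifier: one mean vector per class label."""
    labels = np.unique(y)
    centroids = np.stack([x[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def accuracy(model, x, y):
    """Fraction of samples whose nearest centroid carries the true label."""
    labels, centroids = model
    # Distance from every sample to every class centroid.
    dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    pred = labels[dist.argmin(axis=1)]
    return float((pred == y).mean())

def gan_train_score(gen_x, gen_y, real_x, real_y):
    """GAN-train: fit on generated samples, evaluate on real samples."""
    return accuracy(fit_centroids(gen_x, gen_y), real_x, real_y)

def gan_test_score(real_x, real_y, gen_x, gen_y):
    """GAN-test: fit on real samples, evaluate on generated samples."""
    return accuracy(fit_centroids(real_x, real_y), gen_x, gen_y)

def sample(rng, n_per_class, spread):
    """Hypothetical stand-in for labelled images: two 2-D Gaussian classes."""
    means = np.array([[-2.0, 0.0], [2.0, 0.0]])
    y = np.repeat(np.arange(2), n_per_class)
    x = means[y] + rng.normal(scale=spread, size=(y.size, 2))
    return x, y

rng = np.random.default_rng(0)
real_train = sample(rng, 200, spread=1.0)
real_test = sample(rng, 200, spread=1.0)
generated = sample(rng, 200, spread=0.2)  # assumed class-conditional generator output

gan_train = gan_train_score(*generated, *real_test)
gan_test = gan_test_score(*real_train, *generated)
print(f"GAN-train accuracy: {gan_train:.3f}")
print(f"GAN-test accuracy:  {gan_test:.3f}")
```

Both scores are plain classification accuracies, so they can be read against the accuracy of the same classifier trained and tested on real data only. The toy data here only demonstrates the train/evaluate protocol; it says nothing about how the scores behave on real GAN outputs.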

“…More recently, this was adapted as a confidence estimator to predicting segmentation performance in the clinical domain (Valindria et al, 2017). More recently the same idea has been used recently used for evaluating GANs for image generation (Shmelkov et al, 2018).…”

Section: Related Work (mentioning)

confidence: 99%

To Annotate or Not? Predicting Performance Drop under Domain Shift

Elsahar, Gallé

2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen

Performance drop due to domain-shift is an endemic problem for NLP models in production. This problem creates an urge to continuously annotate evaluation datasets to measure the expected drop in the model performance which can be prohibitively expensive and slow. In this paper we study the problem of predicting the performance drop of modern NLP models under domain-shift, in the absence of any target domain labels. We investigate three families of methods (H-divergence, reverse classification accuracy and confidence measures), show how they can be used to predict the performance drop and study their robustness to adversarial domain-shifts. Our results on sentiment classification and sequence labeling show that our method is able to predict performance drops with an error rate as low as 2.15% and 0.89% for sentiment analysis and POS tagging respectively.

“…Furthermore, although there is no guaranteed correlation between the generated samples quality and the accuracy improvement, MTCM-GAN can also be evaluated with generation metrics. Considering the Frechet inception distance [21] as the metric, MTCM-GAN generator produced samples with a score of 11.64 at the end of the training. As illustrated in Fig.…”

Section: Quantitative Results (1) (mentioning)

confidence: 99%

Understanding Natural Language Instructions for Fetching Daily Objects Using GAN-Based Multimodal Target–Source Classification

Magassouba, Sugiura, Quoc, et al.

2019

IEEE Robot. Autom. Lett.

In this paper, we address multimodal language understanding with unconstrained fetching instruction for domestic service robots. A typical fetching instruction such as "Bring me the yellow toy from the white shelf" requires to infer the user intention, i.e., what object (target) to fetch and from where (source). To solve the task, we propose a Multimodal Target-source Classifier Model (MTCM), which predicts the region-wise likelihood of target and source candidates in the scene. Unlike other methods, MTCM can handle regionwise classification based on linguistic and visual features. We evaluated our approach that outperformed the state-of-the-art method on a standard data set. We also extended MTCM with Generative Adversarial Nets (MTCM-GAN), and enabled simultaneous data augmentation and classification.
