The appearance of generative adversarial networks (GAN) provides a new approach and framework for computer vision. Compared with traditional machine learning algorithms, GAN works via adversarial training concept and is more powerful in both feature learning and representation. GAN also exhibits some problems, such as non-convergence, model collapse, and uncontrollability due to high degree of freedom. How to improve the theory of GAN and apply it to computer-vision-related tasks have now attracted much research efforts. In this paper, recently proposed GAN models and their applications in computer vision are systematically reviewed. In particular, we firstly survey the history and development of generative algorithms, the mechanism of GAN, its fundamental network structures, and theoretical analysis of the original GAN. Classical GAN algorithms are then compared comprehensively in terms of the mechanism, visual results of generated samples, and Frechet Inception Distance. These networks are further evaluated from network construction, performance, and applicability aspects by extensive experiments conducted over public datasets. After that, several typical applications of GAN in computer vision, including high-quality samples generation, style transfer, and image translation, are examined. Finally, some existing problems of GAN are summarized and discussed and potential future research topics are forecasted. INDEX TERMS Deep learning, generative adversarial networks (GAN), computer vision (CV), image generation, style transfer, image inpainting.
Structured output prediction aims to learn a predictor to predict a structured output from a input data vector. The structured outputs include vector, tree, sequence, etc. We usually assume that we have a training set of input-output pairs to train the predictor. However, in many real-world applications, it is difficult to obtain the output for a input, thus for many training input data points, the structured outputs are missing. In this paper, we discuss how to learn from a training set composed of some input-output pairs, and some input data points without outputs. This problem is called semisupervised structured output prediction. We propose a novel method for this problem by constructing a nearest neighbor graph from the input space to present the manifold structure, and using it to regularize the structured output space directly. We define a slack structured output for each training data point, and proposed to predict it by learning a structured output predictor. The learning of both slack structured outputs and the predictor are unified within one single minimization problem. In this problem, we propose to minimize the structured loss between the slack structured outputs of neighboring data points, and the prediction error measured by the structured loss. The problem is optimized by an iterative algorithm. Experiment results over three benchmark data sets show its advantage.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.