Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, speech recognition, and so on. It has outperformed conventional methods in various fields and achieved great successes. Unfortunately, the understanding on how it works remains unclear. It has the central importance to lay down the theoretic foundation for deep learning.In this work, we give a geometric view to understand deep learning: we show that the fundamental principle attributing to the success is the manifold structure in data, namely natural high dimensional data concentrates close to a low-dimensional manifold, deep learning learns the manifold and the probability distribution on it.We further introduce the concepts of rectified linear complexity for deep neural network measuring its learning capability, rectified linear complexity of an embedding manifold describing the difficulty to be learned. Then we show for any deep neural network with fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
Abstract-Specular reflection exists widely in photography and causes the recorded color deviating from its true value, so fast and high quality highlight removal from a single nature image is of great importance. In spite of the progress in the past decades in highlight removal, achieving wide applicability to the large diversity of nature scenes is quite challenging. To handle this problem, we propose an analytic solution to highlight removal based on an L2 chromaticity definition and corresponding dichromatic model. Specifically, this paper derives a normalized dichromatic model for the pixels with identical diffuse color: a unit circle equation of projection coefficients in two subspaces that are orthogonal to and parallel with the illumination, respectively. In the former illumination orthogonal subspace, which is specular-free, we can conduct robust clustering with an explicit criterion to determine the cluster number adaptively. In the latter illumination parallel subspace, a property called pure diffuse pixels distribution rule (PDDR) helps map each specular-influenced pixel to its diffuse component. In terms of efficiency, the proposed approach involves few complex calculation, and thus can remove highlight from high resolution images fast. Experiments show that this method is of superior performance in various challenging cases.Index Terms-Specular reflection, highlight removal, specular and diffuse, L2 normalized dichromatic model, adaptive material clustering.
Though generative adversarial networks (GANs) are prominent models to generate realistic and crisp images, they are unstable to train and suffer from the mode collapse/mixture. The problems of GANs come from approximating the intrinsic discontinuous distribution transform map with continuous DNNs. The recently proposed AE-OT model addresses the discontinuity problem by explicitly computing the discontinuous optimal transform map in the latent space of the autoencoder. Though have no mode collapse/mixture, the generated images by AE-OT are blurry. In this paper, we propose the AE-OT-GAN model to utilize the advantages of the both models: generate high quality images and at the same time overcome the mode collapse/mixture problems. Specifically, we firstly embed the low dimensional image manifold into the latent space by training an autoencoder (AE). Then the extended semi-discrete optimal transport (SDOT) map from the uniform distribution to the empirical latent distribution is used to generate new latent codes. Finally, our GAN model is trained to generate high quality images from the latent distribution induced by the extended SDOT map. The distribution transform map from this dataset related latent distribution to the data distribution will be continuous, and thus can be well approximated by the continuous DNNs. Additionally, the paired data between the latent codes and the real images gives us further restriction about the generator and stabilizes the training process. Experiments on simple MNIST dataset and complex datasets like CIFAR10 and CelebA show the advantages of the proposed method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.