Recent works have cast some light on the mystery of why deep nets fit any data and generalize despite being heavily overparametrized. This paper analyzes training and generalization for a simple 2-layer ReLU net with random initialization, and provides the following improvements over recent works: (i) Using a tighter characterization of training speed than recent papers, an explanation for why training a neural net with random labels leads to slower training, as originally observed in [Zhang et al. ICLR'17]. (ii) A generalization bound independent of network size, using a data-dependent complexity measure. Our measure distinguishes clearly between random labels and true labels on MNIST and CIFAR, as shown by experiments. Moreover, recent papers require sample complexity to increase (slowly) with the network size, while our sample complexity is completely independent of the network size. (iii) Learnability of a broad class of smooth functions by 2-layer ReLU nets trained via gradient descent. The key idea is to track the dynamics of training and generalization via properties of a related kernel.
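To make the kernel-based viewpoint concrete, here is a minimal sketch (an illustration, not the paper's exact construction) of the Gram matrix commonly associated with an infinitely wide 2-layer ReLU net at random initialization; unit-norm inputs, the closed-form entry, and the use of the smallest eigenvalue as a proxy for training speed are assumptions of this sketch.

```python
import numpy as np

def relu_gram_matrix(X):
    """Infinite-width Gram matrix for a 2-layer ReLU net at random init.

    Assumes each row of X is an input with unit Euclidean norm; the
    closed form H_ij = <x_i, x_j> * (pi - arccos(<x_i, x_j>)) / (2*pi)
    is the expectation over standard Gaussian first-layer weights.
    """
    G = X @ X.T                      # pairwise inner products
    G = np.clip(G, -1.0, 1.0)        # guard arccos against rounding error
    return G * (np.pi - np.arccos(G)) / (2 * np.pi)

# Toy usage: in this kind of analysis, the smallest eigenvalue of H is the
# quantity that governs how fast gradient descent fits the training data
# (an assumption of this sketch, stated here only for illustration).
X = np.random.randn(100, 10)
X /= np.linalg.norm(X, axis=1, keepdims=True)
H = relu_gram_matrix(X)
print("lambda_min(H) =", np.linalg.eigvalsh(H)[0])
```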
Modern deep learning methods provide an effective means to learn good representations. However, is a good representation itself sufficient for efficient reinforcement learning? This question is largely unexplored, and the extant body of literature mainly focuses on conditions which permit efficient reinforcement learning, with little understanding of the necessary conditions for efficient reinforcement learning. This work provides strong negative results for reinforcement learning methods with function approximation for which a good representation (feature extractor) is known to the agent, focusing on natural representational conditions relevant to value-based learning and policy-based learning. For value-based learning, we show that even if the agent has a highly accurate linear representation, it still needs to sample exponentially many trajectories in order to find a near-optimal policy. For policy-based learning, we show that even if the agent's linear representation is capable of perfectly representing the optimal policy, it still needs to sample exponentially many trajectories in order to find a near-optimal policy. These lower bounds highlight the fact that having a good (value-based or policy-based) representation in and of itself is insufficient for efficient reinforcement learning. In particular, these results provide new insights into why the existing provably efficient reinforcement learning methods rely on further assumptions, which are often model-based in nature. Additionally, our lower bounds imply exponential separations in sample complexity between 1) value-based learning with a perfect representation and value-based learning with a good-but-not-perfect representation, 2) value-based learning and policy-based learning, 3) policy-based learning and supervised learning, and 4) reinforcement learning and imitation learning.
* Part of this work was done while Simon S. Du was visiting Google Brain Princeton and Ruosong Wang was visiting Princeton University.
Energy is often the most constrained resource for battery-powered wireless devices, and the lion's share of energy is often spent on transceiver usage (sending/receiving packets), not on computation. In this paper we study the energy complexity of Leader Election and Approximate Counting in several models of wireless radio networks. It turns out that energy complexity is very sensitive to whether the devices can generate random bits and to their ability to detect collisions. We consider four collision-detection models: Strong-CD (in which both transmitters and listeners detect collisions), Sender-CD and Receiver-CD (in which only transmitters or only listeners detect collisions), and No-CD (in which no one detects collisions). The take-away message of our results is quite surprising. For randomized Leader Election algorithms, there is an exponential gap between the energy complexity of Sender-CD and Receiver-CD:

Randomized: No-CD = Sender-CD ≫ Receiver-CD = Strong-CD

and for deterministic Leader Election algorithms, there is another exponential gap in energy complexity, but in the reverse direction:

Deterministic: No-CD = Receiver-CD ≫ Sender-CD = Strong-CD

In particular, the randomized energy complexity of Leader Election is Θ(log* n) in Sender-CD but Θ(log(log* n)) in Receiver-CD, where n is the (unknown) number of devices. Its deterministic complexity is Θ(log N) in Receiver-CD but Θ(log log N) in Sender-CD, where N is the (known) size of the devices' ID space. There is a tradeoff between time and energy. We give a new upper bound on the time-energy tradeoff curve for randomized Leader Election and Approximate Counting. A critical component of this algorithm is a new deterministic Leader Election algorithm for dense instances, when n = Θ(N), with inverse-Ackermann-type (O(α(N))) energy complexity.
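To make the energy metric concrete, the following toy simulation (a naive baseline, not any algorithm from the paper) counts per-device transceiver usage for a coin-flip tournament election in the Strong-CD model; the O(log n) per-device energy it incurs is far above the Θ(log* n)-type bounds cited above.

```python
import random

def tournament_election(n, seed=0):
    """Toy simulation of a coin-flip tournament election in Strong-CD.

    Each round, every remaining candidate flips a coin: heads -> transmit,
    tails -> listen.  If anyone transmitted, the listeners concede; a lone
    transmitter (which hears no collision) becomes the leader.  Energy is
    counted as the number of rounds a device keeps its radio on, so the
    worst-case per-device energy of this naive baseline is O(log n).
    """
    rng = random.Random(seed)
    active = set(range(n))
    energy = {d: 0 for d in range(n)}
    while True:
        transmitters = {d for d in active if rng.random() < 0.5}
        for d in active:
            energy[d] += 1                      # radio used: transmit or listen
        if len(transmitters) == 1:              # lone transmitter hears no collision
            return transmitters.pop(), max(energy.values())
        if transmitters:                        # collision: listeners drop out
            active = transmitters
        # if nobody transmitted, everyone stays active and the round repeats

leader, worst_energy = tournament_election(n=1000)
print(f"leader={leader}, worst-case per-device energy={worst_energy} rounds")
```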
ZnO@TiO2 nanorod-array thin-film coatings on an FTO substrate were prepared via a hydrothermal route. The coatings exhibit good transparency and hydrophilicity, as well as very good photocatalytic activity and durability.