We investigate opportunities and challenges for improving unsupervised machine learning using four common strategies with a long history in physics: divide-and-conquer, Occam's razor, unification and lifelong learning. Instead of using one model to learn everything, we propose a novel paradigm centered around the learning and manipulation of theories, which parsimoniously predict both aspects of the future (from past observations) and the domain in which these predictions are accurate. Specifically, we propose a novel generalized-mean loss to encourage each theory to specialize in its comparatively advantageous domain, and a differentiable description-length objective to downweight bad data and "snap" learned theories into simple symbolic formulas. Theories are stored in a "theory hub", which continuously unifies learned theories and can propose theories when encountering new environments. We test our implementation, the toy "AI Physicist" learning agent, on a suite of increasingly complex physics environments. From unsupervised observation of trajectories through worlds involving random combinations of gravity, electromagnetism, harmonic motion and elastic bounces, our agent typically learns faster and produces mean-squared prediction errors about a billion times smaller than a standard feedforward neural net of comparable complexity, typically recovering integer and rational theory parameters exactly. Our agent also successfully identifies domains with different laws of motion for a nonlinear chaotic double pendulum in a piecewise constant force field.
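The specialization mechanism of the generalized-mean loss can be illustrated with a minimal sketch. This is an assumption-laden illustration rather than the paper's implementation: it aggregates hypothetical per-theory prediction errors for one example with a power mean of exponent gamma < 0, so the best-fitting theory dominates the loss and is pushed to specialize.

```python
import numpy as np

def generalized_mean_loss(errors, gamma=-1.0):
    """Power-mean aggregation of per-theory errors for one example.

    errors: array of shape (n_theories,), each entry a positive
            prediction error from one candidate theory.
    gamma:  exponent of the generalized mean; gamma < 0 makes the
            smallest error dominate, encouraging specialization.
    """
    errors = np.asarray(errors, dtype=float)
    return float(np.mean(errors ** gamma) ** (1.0 / gamma))

# Illustrative per-theory errors for a single observation:
per_theory = [0.01, 1.0, 2.0]
```

With `gamma = -20` the aggregate is close to `min(per_theory)` (about 0.01), so only the locally best theory feels gradient pressure, whereas `gamma = 1` recovers the ordinary arithmetic mean (about 1.0).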
A theory for calculating the circular dichroism (CD) of chiral molecules in a finite cluster of arbitrarily disposed objects has been developed by means of the T-matrix method. The interactions between chiral molecules and nanostructures have been investigated, focusing on chiral molecules inserted into the plasmonic hot spots of nanostructures. Our results show that, when the interparticle distances are small, the total CD of a system with two chiral molecules is not the sum of the CDs obtained when each molecule is inserted separately into a hot spot of the nanoparticle cluster, although this additivity does hold at large interparticle distances. The plasmonic CD arising from the structural chirality of the nanocomposites depends strongly on the relative positions and orientations of the nanospheroids, and is much greater than that arising from molecule-induced chirality. However, the molecule-induced plasmonic CD of molecule-NP nanocomposites with special chiral structures can be spectrally distinguished from the structure-based optical activity. Our results provide a new theoretical framework for understanding these two aspects of the plasmonic CD effect in molecule-NP nanocomposites, which should aid the experimental design of novel biosensors for ultrasensitive probing of molecular chiral information via plasmon-based nanotechnology.
The Information Bottleneck (IB) method (Tishby et al., 2000) provides an insightful and principled approach for balancing compression and prediction in representation learning. The IB objective I(X; Z) − βI(Y; Z) employs a Lagrange multiplier β to tune this trade-off. In practice, however, not only is β chosen empirically without theoretical guidance, but there is also little theoretical understanding of the relationship between β, learnability, the intrinsic nature of the dataset, and model capacity. In this paper, we show that if β is improperly chosen, learning cannot happen: the trivial representation P(Z|X) = P(Z) becomes the global minimum of the IB objective, even for β ≫ 1. We show how this can be avoided by identifying a sharp phase transition between the unlearnable and the learnable regimes which arises as β is varied. This phase transition defines the concept of IB-Learnability. We prove several sufficient conditions for IB-Learnability, which provide theoretical guidance for choosing a good β. We further show that IB-Learnability is determined by the largest confident, typical, and imbalanced subset of the examples (the conspicuous subset), and discuss its relation to model capacity. We give practical algorithms to estimate the minimum β for a given dataset, and empirically demonstrate our theoretical conditions with analyses of synthetic datasets, MNIST, and CIFAR10.
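The claim that the trivial representation is a candidate global minimum can be checked concretely in the discrete case. The sketch below is an illustration under our own assumptions (small discrete X, Y, Z and a tabular encoder), not the paper's code: it evaluates the IB objective I(X; Z) − βI(Y; Z) for an arbitrary encoder P(Z|X), and shows that the trivial encoder P(Z|X) = P(Z) always attains objective value 0, so learning only "wins" when some encoder achieves a negative value.

```python
import numpy as np

def mutual_info(p_joint):
    """I(A;B) in nats from a joint distribution matrix p(a, b)."""
    pa = p_joint.sum(axis=1, keepdims=True)   # marginal p(a)
    pb = p_joint.sum(axis=0, keepdims=True)   # marginal p(b)
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (pa @ pb)[mask])))

def ib_objective(p_xy, p_z_given_x, beta):
    """I(X;Z) - beta * I(Y;Z) for a tabular encoder p(z|x)."""
    p_x = p_xy.sum(axis=1)                    # marginal p(x)
    p_xz = p_x[:, None] * p_z_given_x         # joint p(x, z)
    p_yz = p_xy.T @ p_z_given_x               # Z depends on Y only through X
    return mutual_info(p_xz) - beta * mutual_info(p_yz)

# Toy joint: X uniform on {0, 1}, Y = X with probability 0.9.
p_xy = np.array([[0.45, 0.05],
                 [0.05, 0.45]])
trivial = np.full((2, 2), 0.5)                # p(z|x) = p(z): ignores x
identity = np.eye(2)                          # z = x: fully informative
```

For the trivial encoder both mutual informations vanish, so the objective is exactly 0 for every β; the identity encoder beats it (negative objective) only when β is large enough, which is the learnability threshold the abstract describes.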