Artificial neural networks were recently shown to be an efficient representation of highly entangled many-body quantum states. In practical applications, neural-network states inherit numerical schemes used in Variational Monte Carlo, most notably the use of Markov chain Monte Carlo (MCMC) sampling to estimate quantum expectation values. The local stochastic sampling in MCMC caps the potential advantages of neural networks in two ways: (i) its intrinsic computational cost sets stringent practical limits on the width and depth of the networks, and therefore limits their expressive capacity; (ii) its difficulty in generating precise and uncorrelated samples can result in estimates of observables that deviate significantly from their true values. Inspired by the state-of-the-art generative models used in machine learning, we propose a specialized neural-network architecture that supports efficient and exact sampling, completely circumventing the need for Markov chain sampling. We demonstrate our approach for two-dimensional interacting spin models, showcasing the ability to obtain accurate results on larger system sizes than those currently accessible to neural-network quantum states.
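The exact-sampling idea above can be illustrated with a toy autoregressive model: writing the distribution over spin configurations as a product of conditionals, p(s) = Π_i p(s_i | s_1, …, s_{i-1}), lets one draw each spin sequentially, so every sample is exact and independent with no Markov chain and no autocorrelation. The sketch below is a minimal illustration under assumed names (`W`, `b`, `conditional`), not the authors' architecture, which parameterizes the conditionals with a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy autoregressive model over N spins: p(s) = prod_i p(s_i | s_1..s_{i-1}).
# Here the conditionals come from a random linear "network" (W, b) purely for
# illustration; in practice a masked/causal neural network produces them.
N = 8
W = 0.1 * rng.standard_normal((N, N))
b = 0.1 * rng.standard_normal(N)

def conditional(prev, i):
    """Probability that spin i is +1, given the spins sampled so far."""
    logit = b[i] + prev @ W[:i, i]
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid

def sample():
    """Draw one exact, uncorrelated configuration -- no Markov chain needed."""
    s = np.zeros(N)
    for i in range(N):
        p_up = conditional(s[:i], i)
        s[i] = 1.0 if rng.random() < p_up else -1.0
    return s

samples = np.array([sample() for _ in range(1000)])
```

Because each configuration is drawn directly from the model distribution, estimators built from these samples avoid the equilibration and autocorrelation issues of MCMC entirely.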
Modern deep learning has enabled unprecedented achievements in various domains. Nonetheless, the use of machine learning for wave-function representation has so far focused on more traditional architectures such as restricted Boltzmann machines (RBMs) and fully-connected neural networks. In this letter, we establish that contemporary deep learning architectures, in the form of deep convolutional and recurrent networks, can efficiently represent highly entangled quantum systems. By constructing Tensor Network equivalents of these architectures, we identify an inherent reuse of information in the network operation as a key trait which distinguishes them from standard Tensor Network based representations, and which enhances their entanglement capacity. Our results show that such architectures can support volume-law entanglement scaling, polynomially more efficiently than presently employed RBMs. Thus, beyond a quantification of the entanglement capacity of leading deep learning architectures, our analysis formally motivates a shift of trending neural-network based wave function representations closer to the state of the art in machine learning.

Introduction.- Many-body physics and machine learning are distinct scientific disciplines; however, they share a common need for efficient representations of highly expressive multivariate function classes. In the former, the function class of interest captures the entanglement properties of the examined many-body quantum systems; in the latter, it describes the dependencies required for performing modern machine learning tasks.

A prominent approach for classically simulating many-body wave functions makes use of their entanglement properties in order to construct Tensor Network (TN) architectures that aptly model them in the thermodynamic limit [1][2][3][4][5][6][7][8].
Though this method is successful in modeling one-dimensional (1D) systems that obey area-law entanglement scaling with sub-system size [9] through the Matrix Product State (MPS) TN [1,2], it still faces difficulties in modeling two-dimensional (2D) systems due to intractability [3,10].

In the seemingly unrelated field of machine learning, deep network architectures have exhibited an unprecedented ability to tractably encompass the convoluted dependencies that characterize difficult learning tasks such as image classification or speech recognition [11][12][13][14][15][16][17][18]. A consequent machine-learning-inspired approach for modeling wave functions makes use of fully-connected neural networks and restricted Boltzmann machines (RBMs) [19][20][21][22][23][24][25], which represent relatively veteran machine learning constructs.

In this letter, we formally establish that highly entangled many-body wave functions can be efficiently represented by deep learning architectures that are at the forefront of recent empirical successes. Specifically, we address two prominent architectures in the form of convolutional neural networks (CNNs), commonly used over spatial inputs (e.g. image pixels [11]), and recurrent neural networks ...
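The entanglement scaling discussed above is quantified by the von Neumann entropy of a bipartition, which is what decides whether a fixed-bond-dimension MPS can represent a state efficiently: area-law states keep this entropy bounded, while volume-law states do not. A minimal sketch of how this quantity is computed from a state vector (illustrative, not taken from the paper):

```python
import numpy as np

# Entanglement entropy of the left block of a spin-1/2 chain state.
# A product state gives S = 0; a maximally entangled pair gives S = ln 2.
def entanglement_entropy(psi, n_left, n_right):
    """von Neumann entropy across the cut between n_left and n_right spins."""
    m = psi.reshape(2**n_left, 2**n_right)       # matricize the state
    s = np.linalg.svd(m, compute_uv=False)       # Schmidt coefficients
    p = s**2 / np.sum(s**2)                      # Schmidt spectrum
    p = p[p > 1e-12]                             # drop numerical zeros
    return -np.sum(p * np.log(p))

# Bell pair (|00> + |11>)/sqrt(2) on two spins: entropy ln 2.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
S = entanglement_entropy(bell, 1, 1)
```

The number of retained Schmidt coefficients is exactly the MPS bond dimension needed at that cut, which is why bounded entropy (area law) translates into tractable 1D simulation.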
The ability to learn from large unlabeled corpora has allowed neural language models to advance the frontier in natural language understanding. However, existing self-supervision techniques operate at the word-form level, which serves as a surrogate for the underlying semantic content. This paper proposes a method to employ weak supervision directly at the word-sense level. Our model, named SenseBERT, is pre-trained to predict not only the masked words but also their WordNet supersenses. Accordingly, we attain a lexical-semantic level language model, without the use of human annotation. SenseBERT achieves significantly improved lexical understanding, as we demonstrate by experimenting on SemEval Word Sense Disambiguation, and by attaining a state-of-the-art result on the 'Word in Context' task.
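The dual objective described above can be sketched as two prediction heads sharing one contextual embedding: for each masked position the model scores both the word (over the vocabulary) and its WordNet supersense. The shapes and names below (`W_word`, `W_sense`, the label indices) are illustrative assumptions, not the released implementation.

```python
import numpy as np

# Illustrative sketch of a dual masked-LM objective in the spirit of SenseBERT:
# one contextual embedding h feeds a word head and a supersense head, and the
# training loss sums the two cross-entropies for each masked position.
V, n_senses, d = 1000, 45, 16      # toy vocabulary, supersense count, hidden dim
rng = np.random.default_rng(1)

h = rng.standard_normal(d)               # contextual embedding of a masked token
W_word = rng.standard_normal((V, d))     # word-prediction head (assumed shape)
W_sense = rng.standard_normal((n_senses, d))  # supersense head (assumed shape)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

p_word = softmax(W_word @ h)             # distribution over masked word
p_sense = softmax(W_sense @ h)           # distribution over its supersense

word_label, sense_label = 17, 3          # hypothetical gold labels
loss = -np.log(p_word[word_label]) - np.log(p_sense[sense_label])
```

The key point is that the supersense labels are derived automatically from WordNet, so the semantic-level signal requires no human annotation of the training corpus.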
Deep convolutional networks have witnessed unprecedented success in various machine learning applications. Formal understanding of what makes these networks so successful is gradually unfolding, but for the most part there are still significant mysteries to unravel. The inductive bias, which reflects prior knowledge embedded in the network architecture, is one of them. In this work, we establish a fundamental connection between the fields of quantum physics and deep learning. We use this connection to assert novel theoretical observations regarding the role that the number of channels in each layer of a convolutional network fulfills in the overall inductive bias. Specifically, we show an equivalence between the function realized by a deep convolutional arithmetic circuit (ConvAC) and a quantum many-body wave function, which relies on their common underlying tensorial structure. This facilitates the use of quantum entanglement measures as well-defined quantifiers of a deep network's expressive ability to model intricate correlation structures of its inputs. Most importantly, we make available a description of a deep convolutional arithmetic circuit in terms of a Tensor Network. This description enables us to carry out a graph-theoretic analysis of a convolutional network, tying its expressiveness to a min-cut in the graph which characterizes it. We thus demonstrate direct control over the inductive bias of the designed deep convolutional network via its channel numbers, which we show to be related to the min-cut in the underlying graph. This result is relevant to any practitioner designing a convolutional network for a specific task. We theoretically analyze convolutional arithmetic circuits, and empirically validate our findings on more common convolutional networks which involve ReLU activations and max pooling.
Beyond the results described above, the description of a deep convolutional network in terms of well-defined graph-theoretic tools, and the formal structural connection to quantum entanglement, are two interdisciplinary bridges brought forth by this work.
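The expressiveness measure underlying the min-cut result above can be made concrete: matricize the function's coefficient tensor with respect to a partition of the inputs, and the rank of the resulting matrix (which the network's TN min-cut bounds) quantifies the correlations the network can model across that partition. A toy illustration, not taken from the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(2)

# Coefficient tensor of a function over 4 inputs, each taking 2 values.
# For a ConvAC this tensor is determined by the network's weights; here it
# is random purely for illustration.
A = rng.standard_normal((2, 2, 2, 2))

# Matricization with respect to the partition {inputs 1,2} vs {inputs 3,4}:
# rows index the left group, columns the right group.
M = A.reshape(4, 4)
rank = np.linalg.matrix_rank(M)   # correlation measure across the partition
```

In the paper's analysis, the achievable rank for a given partition is capped by the minimal cut separating the corresponding inputs in the network's Tensor Network graph, which is how channel numbers end up controlling the inductive bias.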
The realization of topological superconductors (SCs) in one or two dimensions is a highly pursued goal. Prominent realization schemes are based on semiconductor/superconductor heterostructures and set stringent constraints on the chemical potential of the system. However, keeping the chemical potential in the required range in the presence of an adjacent SC and its accompanying screening effects is a great experimental challenge. In this work, we study a SC lattice structure in which the SC is deposited periodically on a one- or two-dimensional sample. We demonstrate that this realization platform overcomes the challenge of controlling the chemical potential in the presence of the superconductor's electrostatic screening. We show how Majorana bound states emerge at the ends of a one-dimensional system proximity-coupled to a one-dimensional SC lattice, and move on to present a SC-lattice-based realization of the two-dimensional px + ipy SC, hosting chiral Majorana modes at its edges. In particular, we establish that even in the worst case of absolute screening, in which the chemical potential under the SC is completely unaffected by the external gate potential, the topological phase can be reached by tuning the chemical potential in the area not covered by the SC. Finally, we briefly discuss possible effects of Coulomb blockade on the properties of the system.
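The chemical-potential sensitivity at stake above is visible already in the canonical minimal model of a 1D topological SC, the Kitaev chain, whose phase (topological for |mu| < 2t, with near-zero-energy Majorana end modes) is set entirely by the chemical potential relative to the hopping. The sketch below diagonalizes its Bogoliubov-de Gennes Hamiltonian; it is a standard textbook illustration, not the paper's SC-lattice model.

```python
import numpy as np

def kitaev_bdg(n, t, delta, mu):
    """BdG Hamiltonian of an open Kitaev chain with n sites.

    Nambu basis interleaved as (c_1, c_1^dag, c_2, c_2^dag, ...).
    """
    H = np.zeros((2 * n, 2 * n))
    for i in range(n):                 # onsite chemical potential
        H[2*i, 2*i] = -mu
        H[2*i + 1, 2*i + 1] = +mu
    for i in range(n - 1):             # hopping and p-wave pairing
        H[2*i, 2*i + 2] = H[2*i + 2, 2*i] = -t
        H[2*i + 1, 2*i + 3] = H[2*i + 3, 2*i + 1] = +t
        H[2*i, 2*i + 3] = H[2*i + 3, 2*i] = +delta
        H[2*i + 1, 2*i + 2] = H[2*i + 2, 2*i + 1] = -delta
    return H

# Topological phase (|mu| < 2t): lowest |E| is exponentially small in n,
# signalling Majorana bound states at the two ends of the open chain.
E_topo = np.linalg.eigvalsh(kitaev_bdg(40, t=1.0, delta=1.0, mu=0.5))
gap_topo = np.min(np.abs(E_topo))

# Trivial phase (|mu| > 2t): the spectrum stays gapped, no end modes.
E_triv = np.linalg.eigvalsh(kitaev_bdg(40, t=1.0, delta=1.0, mu=3.0))
gap_triv = np.min(np.abs(E_triv))
```

The paper's point, in these terms, is that one can reach the topological window of mu by gating only the regions not covered by the SC, even if screening pins mu underneath the SC itself.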