Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.
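The idea of training only on the nonzero entries of the co-occurrence matrix can be illustrated with a minimal sketch. The abstract itself does not state the objective; the code below assumes the weighted least-squares form given in the GloVe paper, with the weighting parameters x_max = 100 and alpha = 0.75 reported there. The function names (glove_weight, glove_loss) and the toy counts are hypothetical and chosen only for illustration.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting f(x): grows as (x / x_max)**alpha and is capped at 1."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def glove_loss(rows, cols, counts, W, W_ctx, b, b_ctx):
    """Weighted least-squares loss summed over the stored (nonzero) counts only."""
    # Log-bilinear prediction: dot product of word and context vectors plus biases
    pred = np.sum(W[rows] * W_ctx[cols], axis=1) + b[rows] + b_ctx[cols]
    err = pred - np.log(counts)
    return float(np.sum(glove_weight(counts) * err ** 2))

# Toy co-occurrence data in coordinate form: zero entries are never stored,
# so they never enter the sum (assumed data, not from any real corpus).
rows = np.array([0, 0, 1, 2])
cols = np.array([1, 2, 2, 0])
counts = np.array([4.0, 1.0, 150.0, 2.0])

rng = np.random.default_rng(0)
vocab, dim = 3, 5
W = rng.normal(scale=0.1, size=(vocab, dim))      # word vectors
W_ctx = rng.normal(scale=0.1, size=(vocab, dim))  # context vectors
b = np.zeros(vocab)
b_ctx = np.zeros(vocab)

loss = glove_loss(rows, cols, counts, W, W_ctx, b, b_ctx)
print(loss)
```

Because the sum runs over the stored triples (row, column, count), the cost of one pass scales with the number of nonzero counts rather than with the square of the vocabulary size.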
A longstanding goal in deep learning research has been to precisely characterize training and generalization. However, the often complex loss landscapes of neural networks (NNs) have made a theory of learning dynamics elusive. In this work, we show that for wide NNs the learning dynamics simplify considerably and that, in the infinite width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters. Furthermore, mirroring the correspondence between wide Bayesian NNs and Gaussian processes (GPs), gradient-based training of wide NNs with a squared loss produces test set predictions drawn from a GP with a particular compositional kernel. While these theoretical results are only exact in the infinite width limit, we nevertheless find excellent empirical agreement between the predictions of the original network and those of the linearized version, even for finite, practically sized networks. This agreement is robust across different architectures, optimization methods, and loss functions.
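The linear model described above, the first-order Taylor expansion of the network around its initial parameters, can be sketched for a toy case. This is an illustration under stated assumptions, not the authors' code: the two-layer tanh network, its 1/sqrt(width) output scaling, the width of 512, and all function names are hypothetical choices, and a small random parameter displacement stands in for actual training.

```python
import numpy as np

def net(x, W, v):
    """Toy two-layer network f(x) = v . tanh(W x) / sqrt(width)."""
    return v @ np.tanh(W @ x) / np.sqrt(len(v))

def net_grads(x, W, v):
    """Analytic gradients of net with respect to W and v."""
    h = np.tanh(W @ x)
    scale = 1.0 / np.sqrt(len(v))
    grad_v = h * scale
    grad_W = np.outer(v * (1.0 - h ** 2), x) * scale
    return grad_W, grad_v

def linearized_net(x, W, v, W0, v0):
    """First-order Taylor expansion of net around the initial parameters (W0, v0)."""
    grad_W, grad_v = net_grads(x, W0, v0)
    return net(x, W0, v0) + np.sum(grad_W * (W - W0)) + grad_v @ (v - v0)

rng = np.random.default_rng(0)
width, dim = 512, 3
W0 = rng.normal(size=(width, dim))
v0 = rng.normal(size=width)
x = rng.normal(size=dim)

# Small displacement of the parameters, standing in for a few training steps
eps = 1e-3
W = W0 + eps * rng.normal(size=(width, dim))
v = v0 + eps * rng.normal(size=width)

gap = abs(net(x, W, v) - linearized_net(x, W, v, W0, v0))
print(gap)
```

The gap between the two outputs is second order in the parameter displacement, so it shrinks quickly as the parameters stay close to their initial values. The sketch only demonstrates the Taylor expansion itself; it does not reproduce the training dynamics or the GP correspondence discussed in the abstract.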
We present the three-loop remainder function, which describes the scattering of six gluons in the maximally-helicity-violating configuration in planar N = 4 super-Yang-Mills theory, as a function of the three dual conformal cross ratios. The result can be expressed in terms of multiple Goncharov polylogarithms. We also employ a more restricted class of hexagon functions which have the correct branch cuts and certain other restrictions on their symbols. We classify all the hexagon functions through transcendental weight five, using the coproduct for their Hopf algebra iteratively, which amounts to a set of first-order differential equations. The three-loop remainder function is a particular weight-six hexagon function, whose symbol was determined previously. The differential equations can be integrated numerically for generic values of the cross ratios, or analytically in certain kinematic limits, including the near-collinear and multi-Regge limits. These limits allow us to impose constraints from the operator product expansion and multi-Regge factorization directly at the function level, and thereby to fix uniquely a set of Riemann ζ-valued constants that could not be fixed at the level of the symbol. The near-collinear limits agree precisely with recent predictions by Basso, Sever and Vieira based on integrability. The multi-Regge limits agree with the factorization formula of Fadin and Lipatov, and determine three constants entering the impact factor at this order. We plot the three-loop remainder function for various slices of the Euclidean region of positive cross ratios, and compare it to the two-loop one. For large ranges of the cross ratios, the ratio of the three-loop to the two-loop remainder function is relatively constant, and close to −7.
We present the four-loop remainder function for six-gluon scattering with maximal helicity violation in planar N = 4 super-Yang-Mills theory, as an analytic function of three dual-conformal cross ratios. The function is constructed entirely from its analytic properties, without ever inspecting any multi-loop integrand. We employ the same approach used at three loops, writing an ansatz in terms of hexagon functions, and fixing coefficients in the ansatz using the multi-Regge limit and the operator product expansion in the near-collinear limit. We express the result in terms of multiple polylogarithms, and in terms of the coproduct for the associated Hopf algebra. From the remainder function, we extract the BFKL eigenvalue at next-to-next-to-leading logarithmic accuracy (NNLLA), and the impact factor at N³LLA. We plot the remainder function along various lines and on one surface, studying ratios of successive loop orders. As seen previously through three loops, these ratios are surprisingly constant over large regions in the space of cross ratios, and they are not far from the value expected at asymptotically large orders of perturbation theory.