We introduce Codex, a GPT language model finetuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solve 70.2% of our problems with 100 samples per problem. Careful investigation of our model reveals its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, we discuss the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.
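As background for the repeated-sampling result above, functional correctness in this line of work is usually reported via an unbiased pass@k estimator. A minimal sketch in Python (the function name and the NumPy-based product form are illustrative choices, not taken from the abstract): n is the number of generated samples per problem, c the number that pass the unit tests, and k the sampling budget being evaluated.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased estimate of pass@k = 1 - C(n - c, k) / C(n, k),
    # written as a running product to avoid huge binomial coefficients.
    if n - c < k:
        return 1.0  # every size-k subset of the n samples contains a correct sample
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: with 100 samples and 35 passing, pass@1 is 0.35 and pass@100 is 1.0.
print(pass_at_k(100, 35, 1), pass_at_k(100, 35, 100))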
We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image↔text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute budgets increase, following a power-law plus constant scaling law. The optimal model size also depends on the compute budget through a power-law, with exponents that are nearly universal across all data domains. The cross-entropy loss has an information-theoretic interpretation as S(True) + D_KL(True || Model), and the empirical scaling laws suggest a prediction for both the true data distribution's entropy and the KL divergence between the true and model distributions. With this interpretation, billion-parameter Transformers are nearly perfect models of the YFCC100M image distribution downsampled to an 8 × 8 resolution, and we can forecast the model size needed to achieve any given reducible loss (i.e., D_KL) in nats/image for other resolutions. We find a number of additional scaling laws in specific domains: (a) we identify a scaling relation for the mutual information between captions and images in multimodal models, and show how to answer the question "Is a picture worth a thousand words?"; (b) in the case of mathematical problem solving, we identify scaling laws for model performance when extrapolating beyond the training distribution; (c) we finetune generative image models for ImageNet classification and find smooth scaling of the classification loss and error rate, even as the generative loss levels off. Taken together, these results strengthen the case that scaling laws have important implications for neural network performance, including on downstream tasks.
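Schematically, the power-law-plus-constant fit and its information-theoretic reading can be written as follows (the symbols N, N_0, and \alpha_N are illustrative placeholders for model size and fit constants, not notation taken from the abstract):

\[
  L(N) \;=\; L_\infty + \left(\frac{N_0}{N}\right)^{\alpha_N},
  \qquad
  L \;=\; \underbrace{S(\mathrm{True})}_{\text{irreducible},\ \approx\, L_\infty}
     \;+\; \underbrace{D_{\mathrm{KL}}\!\left(\mathrm{True} \,\|\, \mathrm{Model}\right)}_{\text{reducible loss}} .
\]

Under this reading, the constant term estimates the entropy of the true data distribution, and the power-law term is the reducible loss that shrinks as model size and compute grow.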
This is the second in a series of papers on rank decompositions of the matrix multiplication tensor. We present new rank 23 decompositions for the 3×3 matrix multiplication tensor M⟨3⟩. All our decompositions have symmetry groups that include the standard cyclic permutation of factors but otherwise exhibit a range of behavior. One of them has 11 cubes as summands and admits an unexpected symmetry group of order 12. We establish basic information regarding symmetry groups of decompositions and outline two approaches for finding new rank decompositions of M⟨n⟩ for larger n.
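For concreteness, here is the standard notion being used (a textbook definition, not a result of this paper): writing the matrix multiplication tensor as the trilinear form M⟨n⟩(A, B, C) = trace(ABC) on n×n matrices, a rank-R decomposition is an expression

\[
  \operatorname{trace}(ABC) \;=\; \sum_{r=1}^{R} a_r(A)\, b_r(B)\, c_r(C)
\]

with linear functionals a_r, b_r, c_r, and the rank of M⟨n⟩ is the smallest such R. The cyclic symmetry of the factors reflects trace(ABC) = trace(BCA) = trace(CAB); the decompositions above realize R = 23 for M⟨3⟩.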
We generalize two main theorems of matching polynomials of undirected simple graphs, namely, real-rootedness and the Heilmann-Lieb root bound. Viewing the matching polynomial of a graph G as the independence polynomial of the line graph of G, we determine conditions for the extension of these theorems to the independence polynomial of any graph. In particular, we show that a stability-like property of the multivariate independence polynomial characterizes claw-freeness. Finally, we give and extend multivariate versions of Godsil's theorems on the divisibility of matching polynomials of trees related to G.
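For reference, the standard definitions behind this statement (not specific to this paper): with m_k(G) the number of k-edge matchings of a graph G on n vertices and i_k(H) the number of size-k independent sets of a graph H,

\[
  \mu(G,x) \;=\; \sum_{k\ge 0} (-1)^k\, m_k(G)\, x^{\,n-2k},
  \qquad
  I(H,x) \;=\; \sum_{k\ge 0} i_k(H)\, x^{k},
\]

and since the matchings of G are exactly the independent sets of its line graph L(G), one has \mu(G,x) = x^{n}\, I(L(G), -x^{-2}). Real-rootedness refers to \mu(G,x), and the Heilmann-Lieb bound places all of its roots in [-2\sqrt{\Delta-1},\, 2\sqrt{\Delta-1}] when G has maximum degree \Delta \ge 2.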
The Generalized Lax Conjecture asks whether every hyperbolicity cone is a section of a semidefinite cone of sufficiently high dimension. We prove that the space of hyperbolicity cones of hyperbolic polynomials of degree d in n variables contains (n/d)^Ω(d) pairwise distant cones in the Hausdorff metric, and therefore that any semidefinite representation of such polynomials must have dimension at least (n/d)^Ω(d) (even allowing a small approximation error). The cones are perturbations of the hyperbolicity cones of elementary symmetric polynomials. Our proof contains several ingredients of independent interest, including the identification of a large subspace in which the elementary symmetric polynomials lie in the relative interior of the set of hyperbolic polynomials, and a quantitative generalization of the fact that a real-rooted polynomial with two consecutive zero coefficients must have a high multiplicity root at zero.
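To fix terminology (standard definitions, not results of this paper): a homogeneous polynomial p of degree d in n real variables is hyperbolic with respect to a direction e if p(e) \neq 0 and, for every x \in \mathbb{R}^n, the univariate polynomial t \mapsto p(x + te) has only real roots. Its open hyperbolicity cone is

\[
  \Lambda_{++}(p, e) \;=\; \bigl\{\, x \in \mathbb{R}^n : \text{every root of } t \mapsto p(x + te) \text{ is strictly negative} \,\bigr\},
\]

the connected component of \{p \neq 0\} containing e; for p = \det on symmetric matrices and e = I this recovers the cone of positive definite matrices, which is why sections of semidefinite cones are the natural comparison. The elementary symmetric polynomial e_d(x) = \sum_{|S| = d} \prod_{i \in S} x_i is hyperbolic with respect to the all-ones vector, and the cones constructed above are perturbations of its hyperbolicity cone.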