2022
DOI: 10.48550/arxiv.2207.14255
Preprint

Efficient Training of Language Models to Fill in the Middle

Abstract: We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset, which simply moves a span of text from the middle of a document to its end. While this data augmentation has garnered much interest in recent years, we provide extensive evidence that training models with a large fraction of data transformed in this way does not harm the original left-to-right generative capability, as measured by perplexity and sampling evaluations across a wide…
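For reference, the transformation the abstract describes can be sketched in a few lines. The snippet below is a minimal illustration only, assuming character-level split points chosen uniformly at random and illustrative sentinel strings (<PRE>, <SUF>, <MID>); the paper's actual sentinel tokens, span-selection scheme, and tokenization may differ.

import random

def fim_transform(document: str, rng: random.Random) -> str:
    # Split the document into (prefix, middle, suffix) at two random cut points,
    # then move the middle span to the end so a left-to-right model learns to
    # generate it after seeing both the prefix and the suffix.
    i, j = sorted(rng.sample(range(len(document) + 1), 2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    # Sentinel strings here are illustrative placeholders, not the paper's exact tokens.
    return "<PRE>" + prefix + "<SUF>" + suffix + "<MID>" + middle

# Example usage: transform one training document.
rng = random.Random(0)
print(fim_transform("def add(a, b):\n    return a + b\n", rng))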

Cited by 19 publications (20 citation statements)
References 24 publications

“…There is a strong correlation between the model parameter count and accuracy, 25 so we focus only on the largest models with more than 1B parameters. The architectures of the models are all decoder-only like GPT-3, with the ability to insert completions, 26 (except when noted). The first model is a GPT-3 12B fine-tuned on code (Codex), abbreviated as "cushman".…”
Section: Methods (mentioning)
confidence: 99%
“…There is a strong correlation between model parameter count and accuracy, 24 so we focus only on the largest models with more than 1B parameters. The architectures of models are all decoder-only like GPT-3 3 with the ability to insert completions, 25 (except when noted). The first model is a GPT-3 12B fine-tuned on code (Codex) abbreviated as "cushman."…”
Section: Methods (mentioning)
confidence: 99%
“…Other approaches to using DNNs for interpolation involve using the DNN to learn a probabilistic model of the data [5], and generate the interpolated values using the learned data distribution (with applications to natural language processing and time series analysis). In NLP, the ability of language models to learn to infill (missing parts of) text [2] can also be considered close to an extrapolation method.…”
Section: Interpolation By Deep Neural Network (mentioning)
confidence: 99%