2022
DOI: 10.48550/arxiv.2205.12615
Preprint

Autoformalization with Large Language Models

Abstract: Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal. We make the surprising observation that LLMs can correctly translate a significant portion (25.3%)…

Cited by 8 publications (10 citation statements)
References 20 publications
“…); further sub-types include declarations combined with assumptions, such as "Let k be an integer such that n = 2k" (which are existentially loaded and in need of verification), and justified claims ("Since x = 3(a + b), x is a multiple of 3"). The sentences written in this CNL are converted into an internal list format whose crucial ingredients are a list of the variables occurring in the sentence, its type (assumption, declaration, claim, annotation, ...), and its actual content (which can be empty, as in the case of annotations). Thus, the sentence "Therefore, x is even" would be translated as…”
Section: The Diproche CNL and the Internal List Format
Mentioning confidence: 99%
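The internal list format the excerpt describes (variables, sentence type, content) can be sketched as a simple record type. This is an illustrative reconstruction only: the field names and string encodings below are hypothetical, and Diproche itself is implemented in Prolog, not Python.

```python
# Hedged sketch of the internal list format described in the excerpt above.
# Field names and content encodings are assumptions for illustration;
# they are not taken from the Diproche source.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CnlSentence:
    variables: List[str]           # variables occurring in the sentence
    kind: str                      # "assumption", "declaration", "claim", "annotation", ...
    content: Optional[str] = None  # actual content; may be empty, e.g. for annotations

# "Let k be an integer such that n = 2k" — a declaration combined with an assumption
s1 = CnlSentence(variables=["k", "n"], kind="declaration", content="n = 2*k")

# "Therefore, x is even" — a claim about x
s2 = CnlSentence(variables=["x"], kind="claim", content="even(x)")

# An annotation with empty content
s3 = CnlSentence(variables=[], kind="annotation")
```

A flat record like this makes the downstream verification step straightforward: the checker can dispatch on `kind` and only attempt proof obligations for claims and existentially loaded declarations.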
“…At first, the experiences reported in Avigad et al. [1] with using large language models for autoformalization appear discouraging for this plan: only about 11 percent of the natural language inputs were formalized correctly ([1], p. 3). Much better results were reported in [12], where more than 25 percent of the natural language inputs (problems from math competitions) were translated correctly into Isabelle (ibid., p. 1). Still, for checking even a simple natural language argument, typically consisting of more than 10 sentences, with sufficient reliability to be of didactical use to beginner students, anything considerably below a hundred percent is not good enough.…”
Section: Introduction
Mentioning confidence: 96%
“…LLMs are Transformers [43], the state-of-the-art neural architecture for natural language processing. Additionally, Transformers have shown remarkable performance when applied to classical problems in verification (e.g., [19, 41, 26, 9]), reasoning (e.g., [28, 51]), as well as the auto-formalization [36] of mathematics and formal specifications (e.g., [50, 20, 22]).…”
Section: Large Language Models
Mentioning confidence: 99%
“…The term autoformalization (Wang et al., 2018; Szegedy, 2020) has been coined for tasks of translating between natural language and formal specifications or proofs. Closest to our work is a very recent, independently developed effort in translating between natural language and formal proofs using very large language models (Wu et al., 2022).…”
Section: Related Work
Mentioning confidence: 99%