ENIGMA is a learning-based method for guiding given-clause selection in saturation-based theorem provers. Clauses from many proof searches are classified as positive or negative based on their participation in the proofs. An efficient classification model is trained on this data, using a fast feature-based characterization of the clauses. The learned model is then tightly linked with the core prover and used as the basis of a new parameterized evaluation heuristic that provides fast ranking of all generated clauses. The approach is evaluated on the E prover and the CASC 2016 AIM benchmark, showing a large increase in E's performance.
Introduction: Theorem Proving and Learning

State-of-the-art resolution/superposition automated theorem provers (ATPs) such as Vampire [15] and E [20] are today's most advanced tools for general reasoning across a variety of mathematical and scientific domains. The stronger the performance of such tools, the more realistic become tasks such as full computer understanding and automated development of complicated mathematical theories, and verification of software, hardware and engineering designs. While the performance of ATPs has steadily grown over the past years thanks to a number of human-designed improvements, it is still on average far behind the performance of trained mathematicians. Their advanced knowledge-based proof finding is an enigma, which is unlikely to be deciphered and programmed completely manually in the near future.

On large corpora such as Flyspeck, Mizar and Isabelle, ATP progress has been mainly due to learning how to select the most relevant knowledge, based on many previous proofs [10,12,1,2]. Learning from many proofs has also recently become a very useful method for automatically finding the parameters of ATP strategies [22,9,19,16], and for learning sequences of tactics in interactive theorem provers (ITPs) [7]. Several experiments with the compact leanCoP [18] system have recently shown that directly using a trained machine learner for internal clause selection can significantly prune the search space and solve additional problems [24,11,5]. An obvious next step is to implement efficient learning-based clause selection also inside the strongest superposition-based ATPs.

In this work, we introduce ENIGMA - Efficient learNing-based Internal Guidance MAchine for state-of-the-art saturation-based ATPs. The method applies fast machine-learning algorithms to a large number of proofs, and uses the trained classifier together with simpler heuristics to evaluate the millions of clauses generated during the resolution/superposition proof search.
This way, the theorem prover automatically takes into account thousands of successes and failures it has seen in previous problems, similarly to trained humans. Thanks to a carefully chosen efficient learning/evaluation method and its tight integration with the core ATP (in our case the E prover), the penalty for this ubiquitous knowledge-based internal proof guidance is very low. This in turn significantly improves the per...
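The overall pipeline (feature-based characterization of clauses, a classifier trained on positive/negative proof-search clauses, and score-based ranking of generated clauses) can be sketched as follows. This is a minimal illustrative sketch, not ENIGMA's actual implementation: the symbol-pair features, the perceptron-style training loop, and the toy clauses are all assumptions made for the example.

```python
# Illustrative sketch of learning-based clause ranking (assumed, simplified):
# clauses are mapped to sparse feature vectors, a linear classifier is
# trained on positive (in-proof) vs. negative (not-in-proof) clauses,
# and newly generated clauses are ranked by the learned score.

from collections import Counter

def features(clause):
    """Sparse feature vector: counts of adjacent symbol pairs in the
    clause's flat symbol sequence (a toy stand-in for clause features)."""
    syms = clause.split()
    return Counter(zip(syms, syms[1:]))

def score(w, clause):
    """Linear score of a clause under weight vector w."""
    return sum(w[k] * v for k, v in features(clause).items())

def train(pos, neg, epochs=20, lr=0.1):
    """Perceptron-style training of a sparse linear classifier on
    positive and negative example clauses."""
    w = Counter()
    data = [(c, 1) for c in pos] + [(c, -1) for c in neg]
    for _ in range(epochs):
        for clause, label in data:
            if label * score(w, clause) <= 0:   # misclassified: update
                for k, v in features(clause).items():
                    w[k] += lr * label * v
    return w

def rank(w, clauses):
    """Order generated clauses by learned score, best first."""
    return sorted(clauses, key=lambda c: -score(w, c))

# Toy training data (hypothetical clauses, written as symbol strings).
pos = ["mul X e = X", "mul e X = X"]        # appeared in proofs
neg = ["mul X Y = mul Y Z", "f f f X = Y"]  # did not appear in proofs

w = train(pos, neg)
ranked = rank(w, ["mul Y e = Y", "f f X = f Y"])
```

In a real prover the learned score would not replace the existing clause-selection heuristics but be combined with them, as the abstract's "parameterized evaluation heuristic" suggests, and feature extraction must be fast enough to apply to millions of generated clauses.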