Inspired by the success of the DRAT proof format for certification of Boolean satisfiability (SAT), we argue that a similar goal of having unified, automatically checkable proofs should be sought by the developers of automatic first-order theorem provers (ATPs). This would not only help to further increase assurance about the correctness of prover results, but would also be indispensable for tools which rely on ATPs, such as "hammers" employed within interactive theorem provers. The current situation, represented by the TSTP format, is unsatisfactory, because this format does not have a standardised semantics and thus cannot be checked automatically. Providing such semantics, however, is a challenging endeavour. One would ideally like to have a proof format which covers operations that preserve only satisfiability, such as Skolemisation, and is versatile enough to encompass various proving methods (i.e. not just superposition), or is perhaps even open-ended towards methods yet to be conceived, or at least easily extendable in principle. Going beyond pure first-order logic to theory reasoning in the style of SMT, or beyond proofs to certification of satisfiability, are further interesting challenges. Although several projects have already provided partial solutions in this direction, we would like to use the opportunity of ARCADE to further promote the idea and gather the critical mass needed for its satisfactory realisation.
The challenge

We would like to propose to the first-order ATP community the challenge of designing, implementing and bringing into practice a unified mechanically checkable proof format along with an efficient proof checker. The format should support the whole reasoning pipeline including formula preprocessing, be sufficiently general to cover all the solving techniques currently employed by ATPs, and be open to future extensions for proof recording of techniques yet to be developed. In this paper, we summarise the current situation regarding proof output of ATPs, explain why we think striving for a mechanically checkable proof format is a worthy effort, list the main properties we believe an ideal format should satisfy, attempt to give an overview of work already done in the first-order ATP community and related areas, and, finally, suggest possible avenues and the next steps to be taken for meeting the challenge.

At this point we add the disclaimer that other people have already examined this challenge in various ways. We attempt to present this previous work and do not claim that what we are suggesting is novel; instead, we are calling for further work in this area. Our main aim at ARCADE is to solicit opinions from experts on why the proposed idea has not yet made its way into practice and on how exactly the community should proceed to achieve the envisioned goal.