The satisfiability problem (SAT) is one of the most famous problems in computer science. Traditionally, its NP-completeness has been used to argue that SAT is intractable. However, there have been tremendous practical advances in recent years that allow modern SAT solvers to solve instances with millions of variables and clauses. A particularly successful paradigm in this context is stochastic local search (SLS).
In most cases, there are different ways of formulating the underlying SAT problem. While it is known that the precise formulation of the problem has a significant impact on the runtime of solvers, finding a helpful formulation is generally non-trivial. The recently introduced GapSAT solver [Lorenz and Wörz 2020] demonstrated a successful way to improve the average performance of an SLS solver by learning additional information that is logically entailed by the original problem. Still, there were also cases in which the performance slightly deteriorated. This justifies in-depth investigations into how learning logical implications affects runtimes for SLS algorithms.
In this work, we propose a method for generating logically equivalent problem formulations, generalizing the ideas of GapSAT. This method allows a rigorous mathematical study of the effect on the runtime of SLS SAT solvers. Initially, we conduct empirical investigations. If the modification process is treated as random, Johnson SB distributions provide a perfect characterization of the hardness. Since the observed Johnson SB distributions approach lognormal distributions, our analysis also suggests that the hardness is long-tailed.
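For reference, the Johnson SB family mentioned above is usually defined through a transformation to a standard normal variable. The symbols below (shape parameters \gamma and \delta, location \xi, scale \lambda) follow the common textbook convention and are an assumption here; the paper itself may use different notation.

```latex
\[
  Z \;=\; \gamma + \delta \,\ln\!\left(\frac{X - \xi}{\xi + \lambda - X}\right),
  \qquad Z \sim \mathcal{N}(0, 1),
  \qquad \xi < X < \xi + \lambda,
  \qquad \delta > 0,\ \lambda > 0 .
\]
```

Under this definition the random variable X has bounded support, namely the interval (\xi, \xi + \lambda).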
As a second contribution, we theoretically prove that restarts are useful for long-tailed distributions. This implies that incorporating additional restarts can further improve all algorithms employing the above-mentioned modification technique.
Since the empirical studies compellingly suggest that the runtime distributions follow Johnson SB distributions, we also investigate this property on a theoretical basis. We succeed in proving that the runtimes for the special case of Schöning’s random walk algorithm [Schöning 2002] are approximately Johnson SB distributed.
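To make the algorithm referred to above concrete, the following is a minimal sketch of a Schöning-style random walk with restarts. It is an illustration under stated assumptions, not the implementation used in this work: the function name `schoening_walk`, the DIMACS-like clause encoding (signed integers), the `max_tries` cutoff, and the uniform choice among unsatisfied clauses are all choices made here for the example.

```python
import random


def schoening_walk(clauses, n_vars, max_tries=1000, seed=None):
    """Sketch of a Schoening-style random walk for SAT (hypothetical helper).

    clauses: list of clauses, each a list of non-zero ints; literal v means
             variable v is true, -v means variable v is false.
    Returns a satisfying assignment as a list of bools (variable 1 first),
    or None if no solution was found within max_tries restarts.
    """
    rng = random.Random(seed)

    def unsatisfied(assignment):
        # A clause is satisfied if at least one literal agrees with the assignment.
        return [c for c in clauses
                if not any(assignment[abs(lit)] == (lit > 0) for lit in c)]

    for _ in range(max_tries):
        # Restart: draw a fresh uniformly random assignment (index 0 is unused).
        assignment = [rng.random() < 0.5 for _ in range(n_vars + 1)]
        for step in range(3 * n_vars + 1):
            unsat = unsatisfied(assignment)
            if not unsat:
                return assignment[1:]
            if step == 3 * n_vars:
                break  # flip budget of 3n exhausted; fall through to a restart
            # Pick an unsatisfied clause and flip one of its variables at random.
            clause = rng.choice(unsat)
            var = abs(rng.choice(clause))
            assignment[var] = not assignment[var]
    return None
```

Each outer iteration is one restart with at most 3n flips; the number of flips or restarts until a solution is found is the kind of runtime quantity whose distribution is studied here.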