Boolean satisfiability problem
In logic and computer science, the Boolean satisfiability problem asks whether there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the formula's variables can be consistently replaced by the values TRUE or FALSE to make the formula evaluate to TRUE. If this is the case, the formula is called satisfiable, else unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make = TRUE. In contrast, "a AND NOT a" is unsatisfiable.
SAT is the first problem that was proven to be NP-complete—this is the Cook–Levin theorem. This means that all problems in the complexity class NP, which includes a wide range of natural decision and optimization problems, are at most as difficult to solve as SAT. There is no known algorithm that efficiently solves each SAT problem. Although such an algorithm is generally believed not to exist, this belief has not been proven or disproven mathematically. Resolving the question of whether SAT has a polynomial-time algorithm would settle the P versus NP problem - one of the most important open problems in the theory of computing.
Nevertheless, as of 2007, heuristic SAT-algorithms are able to solve problem instances involving tens of thousands of variables and formulas consisting of millions of symbols, which is sufficient for many practical SAT problems occurring in artificial intelligence, circuit design, and automatic theorem proving.
Definitions
A propositional logic formula, also called Boolean expression, is built from variables, operators AND, OR, NOT, and parentheses. A formula is said to be satisfiable if it can be made TRUE by assigning appropriate logical values to its variables. The Boolean satisfiability problem is, given a formula, to check whether it is satisfiable. This decision problem is of central importance in many areas of computer science, including theoretical computer science, complexity theory, algorithmics, cryptography and artificial intelligence.Conjunctive normal form
A literal is either a variable or the negation of a variable. A clause is a disjunction of literals. A clause is called a Horn clause if it contains at most one positive literal. A formula is in conjunctive normal form if it is a conjunction of clauses.For example, is a positive literal, is a negative literal, and is a clause. The formula is in conjunctive normal form; its first and third clauses are Horn clauses, but its second clause is not. The formula is satisfiable, by choosing x1 = FALSE, x2 = FALSE, and x3 arbitrarily, since ∧ ∧ ¬FALSE evaluates to ∧ ∧ TRUE, and in turn to TRUE ∧ TRUE ∧ TRUE. In contrast, the CNF formula a ∧ ¬a, consisting of two clauses of one literal, is unsatisfiable, since for a=TRUE or a=FALSE it evaluates to TRUE ∧ ¬TRUE or FALSE ∧ ¬FALSE, respectively.
For some versions of the SAT problem, it is useful to define the notion of a generalized conjunctive normal form formula, viz. as a conjunction of arbitrarily many generalized clauses, the latter being of the form for some Boolean function R and literals. Different sets of allowed Boolean functions lead to different problem versions. As an example, R is a generalized clause, and R ∧ R ∧ R is a generalized conjunctive normal form. This formula is used [|below], with R being the ternary operator that is TRUE just when exactly one of its arguments is.
Using the laws of Boolean algebra, every propositional logic formula can be transformed into an equivalent conjunctive normal form, which may, however, be exponentially longer. For example, transforming the formula ∨ ∨... ∨ into conjunctive normal form yields
while the former is a disjunction of n conjunctions of 2 variables, the latter consists of 2n clauses of n variables.
However, with use of the Tseytin transformation, we may find an equisatisfiable conjunctive normal form formula with length linear in the size of the original propositional logic formula.
Complexity
SAT was the first problem known to be NP-complete, as proved by Stephen Cook at the University of Toronto in 1971 and independently by Leonid Levin at the Russian Academy of Sciences in 1973. Until that time, the concept of an NP-complete problem did not even exist. The proof shows how every decision problem in the complexity class NP can be reduced to the SAT problem for CNF formulas, sometimes called CNFSAT. A useful property of Cook's reduction is that it preserves the number of accepting answers. For example, deciding whether a given graph has a 3-coloring is another problem in NP; if a graph has 17 valid 3-colorings, then the SAT formula produced by the Cook–Levin reduction will have 17 satisfying assignments.NP-completeness only refers to the run-time of the worst case instances. Many of the instances that occur in practical applications can be solved much more quickly. See [|§Algorithms for solving SAT] below.
3-satisfiability
Like the satisfiability problem for arbitrary formulas, determining the satisfiability of a formula in conjunctive normal form where each clause is limited to at most three literals is NP-complete also; this problem is called 3-SAT, 3CNFSAT, or 3-satisfiability. To reduce the unrestricted SAT problem to 3-SAT, transform each clause to a conjunction of clauseswhere are fresh variables not occurring elsewhere. Although the two formulas are not logically equivalent, they are equisatisfiable. The formula resulting from transforming all clauses is at most 3 times as long as its original; that is, the length growth is polynomial.
3-SAT is one of Karp's 21 NP-complete problems, and it is used as a starting point for proving that other problems are also NP-hard. This is done by polynomial-time reduction from 3-SAT to the other problem. An example of a problem where this method has been used is the clique problem: given a CNF formula consisting of c clauses, the corresponding graph consists of a vertex for each literal, and an edge between each two non-contradicting literals from different clauses; see the picture. The graph has a c-clique if and only if the formula is satisfiable.
There is a simple randomized algorithm due to Schöning that runs in time n where n is the number of variables in the 3-SAT proposition, and succeeds with high probability to correctly decide 3-SAT.
The exponential time hypothesis asserts that no algorithm can solve 3-SAT in time.
Selman, Mitchell, and Levesque give empirical data on the difficulty of randomly generated 3-SAT formulas, depending on their size parameters. Difficulty is measured in number recursive calls made by a DPLL algorithm. They identified a phase transition region from almost-certainly-satisfiable to almost-certainly-unsatisfiable formulas at the clauses-to-variables ratio at about 4.26.
3-satisfiability can be generalized to k-satisfiability, when formulas in CNF are considered with each clause containing up to k literals. However, since for any k ≥ 3, this problem can neither be easier than 3-SAT nor harder than SAT, and the latter two are NP-complete, so must be k-SAT.
Some authors restrict k-SAT to CNF formulas with exactly k literals. This does not lead to a different complexity class either, as each clause with j < k literals can be padded with fixed dummy variables to. After padding all clauses, 2k–1 extra clauses must be appended to ensure that only can lead to a satisfying assignment. Since k does not depend on the formula length, the extra clauses lead to a constant increase in length. For the same reason, it does not matter whether duplicate literals are allowed in clauses, as in.
Special instances of 3SAT
Conjunctive normal form
Conjunctive normal form is often considered the canonical representation for SAT formulas. As shown above, the general SAT problem reduces to 3-SAT, the problem of determining satisfiability for formulas in this form.Linear SAT
A 3-SAT formula is Linear SAT if each clause intersects at most one other clause, and, moreover, if two clauses intersect, then they have exactly one literal in common. An LSAT formula can be depicted as a set of disjoint semi-closed intervals on a line. Deciding whether an LSAT formula is satisfiable is NP-complete.2-satisfiability
SAT is easier if the number of literals in a clause is limited to at most 2, in which case the problem is called 2-SAT. This problem can be solved in polynomial time, and in fact is complete for the complexity class NL. If additionally all OR operations in literals are changed to XOR operations, then the result is called exclusive-or 2-satisfiability, which is a problem complete for the complexity class SL = L.Horn-satisfiability
The problem of deciding the satisfiability of a given conjunction of Horn clauses is called Horn-satisfiability, or HORN-SAT. It can be solved in polynomial time by a single step of the unit propagation algorithm, which produces the single minimal model of the set of Horn clauses. Horn-satisfiability is P-complete. It can be seen as P's version of the Boolean satisfiability problem. Also, deciding the truth of quantified Horn formulas can be done in polynomial time.Horn clauses are of interest because they are able to express implication of one variable from a set of other variables. Indeed, one such clause ¬x1 ∨... ∨ ¬xn ∨ y can be rewritten as x1 ∧... ∧ xn → y; that is, if x1,...,xn are all TRUE, then y must be TRUE as well.
A generalization of the class of Horn formulas is that of renameable-Horn formulae, which is the set of formulas that can be placed in Horn form by replacing some variables with their respective negation. For example, ∧ ∧ ¬x1 is not a Horn formula, but can be renamed to the Horn formula ∧ ∧ ¬x1 by introducing y3 as negation of x3. In contrast, no renaming of ∧ ∧ ¬x1 leads to a Horn formula. Checking the existence of such a replacement can be done in linear time; therefore, the satisfiability of such formulae is in P as it can be solved by first performing this replacement and then checking the satisfiability of the resulting Horn formula.