2-satisfiability
In computer science, 2-satisfiability, 2-SAT or just 2SAT is a computational problem of assigning values to variables, each of which has two possible values, in order to satisfy a system of constraints on pairs of variables. It is a special case of the general Boolean satisfiability problem, which can involve constraints on more than two variables, and of constraint satisfaction problems, which can allow more than two choices for the value of each variable. But in contrast to those more general problems, which are NP-complete, 2-satisfiability can be solved in polynomial time.
Instances of the 2-satisfiability problem are typically expressed as Boolean formulas of a special type, called conjunctive normal form or Krom formulas. Alternatively, they may be expressed as a special type of directed graph, the implication graph, which expresses the variables of an instance and their negations as vertices in a graph, and constraints on pairs of variables as directed edges. Both of these kinds of inputs may be solved in linear time, either by a method based on backtracking or by using the strongly connected components of the implication graph. Resolution, a method for combining pairs of constraints to make additional valid constraints, also leads to a polynomial time solution. The 2-satisfiability problems provide one of two major subclasses of the conjunctive normal form formulas that can be solved in polynomial time; the other of the two subclasses is Horn-satisfiability.
2-satisfiability may be applied to geometry and visualization problems in which a collection of objects each have two potential locations and the goal is to find a placement for each object that avoids overlaps with other objects. Other applications include clustering data to minimize the sum of the diameters of the clusters, classroom and sports scheduling, and recovering shapes from information about their cross-sections.
In computational complexity theory, 2-satisfiability provides an example of an NL-complete problem, one that can be solved non-deterministically using a logarithmic amount of storage and that is among the hardest of the problems solvable in this resource bound. The set of all solutions to a 2-satisfiability instance can be given the structure of a median graph, but counting these solutions is #P-complete and therefore not expected to have a polynomial-time solution. Random instances undergo a sharp phase transition from solvable to unsolvable instances as the ratio of constraints to variables increases past 1, a phenomenon conjectured but unproven for more complicated forms of the satisfiability problem. A computationally difficult variation of 2-satisfiability, finding a truth assignment that maximizes the number of satisfied constraints, has an approximation algorithm whose optimality depends on the unique games conjecture, and another difficult variation, finding a satisfying assignment minimizing the number of true variables, is an important test case for parameterized complexity.
Problem representations
A 2-satisfiability problem may be described using a Boolean expression with a special restricted form. It is a conjunction of clauses, where each clause is a disjunction of two variables or negated variables. The variables or their negations appearing in this formula are known as literals. For example, the following formula is in conjunctive normal form, with seven variables, eleven clauses, and 22 literals:The 2-satisfiability problem is to find a truth assignment to the variables of a formula in this form that makes the whole formula true. Such an assignment chooses whether to make each of the variables true or false, so that at least one literal in every clause becomes true. For the expression shown above, one possible satisfying assignment is the one that sets all seven of the variables to true. Every clause has at least one non-negated variable, so this assignment satisfies every clause. There are also 15 other ways of setting all the variables so that the formula becomes true. Therefore, the 2-satisfiability instance represented by this expression is satisfiable.
Formulas in this form are known as 2-CNF formulas. The "2" in this name stands for the number of literals per clause, and "CNF" stands for conjunctive normal form, a type of Boolean expression in the form of a conjunction of disjunctions. They are also called Krom formulas, after the work of UC Davis mathematician Melven R. Krom, whose 1967 paper was one of the earliest works on the 2-satisfiability problem.
Each clause in a 2-CNF formula is logically equivalent to an implication from one variable or negated variable to the other. For example, the second clause in the example may be written in any of three equivalent ways:
Because of this equivalence between these different types of operation, a 2-satisfiability instance may also be written in implicative normal form, in which we replace each or clause in the conjunctive normal form by the two implications to which it is equivalent.
A third, more graphical way of describing a 2-satisfiability instance is as an implication graph. An implication graph is a directed graph in which there is one vertex per variable or negated variable, and an edge connecting one vertex to another whenever the corresponding variables are related by an implication in the implicative normal form of the instance. An implication graph must be a skew-symmetric graph, meaning that it has a symmetry that takes each variable to its negation and reverses the orientations of all of the edges.
Algorithms
Several algorithms are known for solving the 2-satisfiability problem. The most efficient of them take linear time.Resolution and transitive closure
described the following polynomial time decision procedure for solving 2-satisfiability instances.Suppose that a 2-satisfiability instance contains two clauses that both use the same variable x, but that x is negated in one clause and not in the other. Then the two clauses may be combined to produce a third clause, having the two other literals in the two clauses; this third clause must also be satisfied whenever the first two clauses are both satisfied. This is called resolution. For instance, we may combine the clauses and in this way to produce the clause. In terms of the implicative form of a 2-CNF formula, this rule amounts to finding two implications and, and inferring by transitivity a third implication.
Krom writes that a formula is consistent if repeated application of this inference rule cannot generate both the clauses and, for any variable. As he proves, a 2-CNF formula is satisfiable if and only if it is consistent. For, if a formula is not consistent, it is not possible to satisfy both of the two clauses and simultaneously. And, if it is consistent, then the formula can be extended by repeatedly adding one clause of the form or at a time, preserving consistency at each step, until it includes such a clause for every variable. At each of these extension steps, one of these two clauses may always be added while preserving consistency, for if not then the other clause could be generated using the inference rule. Once all variables have a clause of this form in the formula, a satisfying assignment of all of the variables may be generated by setting a variable to true if the formula contains the clause and setting it to false if the formula contains the clause.
Krom was concerned primarily with completeness of systems of inference rules, rather than with the efficiency of algorithms. However, his method leads to a polynomial time bound for solving 2-satisfiability problems. By grouping together all of the clauses that use the same variable, and applying the inference rule to each pair of clauses, it is possible to find all inferences that are possible from a given 2-CNF instance, and to test whether it is consistent, in total time, where is the number of variables in the instance. This formula comes from multiplying the number of variables by the number of pairs of clauses involving a given variable, to which the inference rule may be applied. Thus, it is possible to determine whether a given 2-CNF instance is satisfiable in time. Because finding a satisfying assignment using Krom's method involves a sequence of consistency checks, it would take time. quote a faster time bound of for this algorithm, based on more careful ordering of its operations. Nevertheless, even this smaller time bound was greatly improved by the later linear time algorithms of and.
In terms of the implication graph of the 2-satisfiability instance, Krom's inference rule can be interpreted as constructing the transitive closure of the graph. As observes, it can also be seen as an instance of the Davis–Putnam algorithm for solving satisfiability problems using the principle of resolution. Its correctness follows from the more general correctness of the Davis–Putnam algorithm. Its polynomial time bound follows from the fact that each resolution step increases the number of clauses in the instance, which is upper bounded by a quadratic function of the number of variables.
Limited backtracking
describe a technique involving limited backtracking for solving constraint satisfaction problems with binary variables and pairwise constraints. They apply this technique to a problem of classroom scheduling, but they also observe that it applies to other problems including 2-SAT.The basic idea of their approach is to build a partial truth assignment, one variable at a time. Certain steps of the algorithms are "choice points", points at which a variable can be given either of two different truth values, and later steps in the algorithm may cause it to backtrack to one of these choice points. However, only the most recent choice can be backtracked over. All choices made earlier than the most recent one are permanent.
Initially, there is no choice point, and all variables are unassigned. At each step, the algorithm chooses the variable whose value to set, as follows:
- If there is a clause both of whose variables are already set, in a way that falsifies the clause, then the algorithm backtracks to its most recent choice point, undoing the assignments it made since that choice, and reverses the decision made at that choice. If there is no choice point, or if the algorithm has already backtracked over the most recent choice point, then it aborts the search and reports that the input 2-CNF formula is unsatisfiable.
- If there is a clause in which one of the clause's two variables has already been set, and the clause could still become either true or false, then the other variable is set in a way that forces the clause to become true.
- In the remaining case, each clause is either guaranteed to become true no matter how the remaining variables are assigned, or neither of its two variables has been assigned yet. In this case the algorithm creates a new choice point and sets any one of the unassigned variables to an arbitrarily chosen value.
Even et al. did not describe in detail how to implement this algorithm efficiently. They state only that by "using appropriate data structures in order to find the implications of any decision", each step of the algorithm can be performed quickly. However, some inputs may cause the algorithm to backtrack many times, each time performing many steps before backtracking, so its overall complexity may be nonlinear. To avoid this problem, they modify the algorithm so that, after reaching each choice point, it begins simultaneously testing both of the two assignments for the variable set at the choice point, spending equal numbers of steps on each of the two assignments. As soon as the test for one of these two assignments would create another choice point, the other test is stopped, so that at any stage of the algorithm there are only two branches of the backtracking tree that are still being tested. In this way, the total time spent performing the two tests for any variable is proportional to the number of variables and clauses of the input formula whose values are permanently assigned. As a result, the algorithm takes linear time in total. In experimental comparisons on randomly-generated instances, a version of this backtracking algorithm was faster than other linear-time algorithms.