Contents
Part A: Introduction, measuring the efficiency of algorithms, polynomial and exponential time algorithms
Part B: Recognition problems, the Satisfiability Problem, non-deterministic algorithms
Part C: Classes P and NP, polynomial reductions and transformations, NP-complete problems
Part D: Cook's theorem and implications, some concluding remarks
D1. Cook's theorem
Recall:
Let A be a recognition problem.
Then, the following propositions are equivalent:
|
An outline of the proof is given below:
(1)
(2)
Since A
NP, any "yes" instance of A has the "succinct certificate" property, that
is, there is a certificate proving that the instance is indeed a "yes"
one, of size bounded by a polynomial in the instance size, and there is
a "certificate-checking" algorithm which can check the certificate for
validity, in polynomial time. A non-deterministic algorithm can then be
constructed for A as follows:
Branch out the computation so that all possible certificates can be examined in parallel. As in the example with the Integer Programming problem in section B2, this can be done while the computation tree still has only polynomial depth, because each certificate has a polynomial length. Then, for each of these possible certificates, apply the polynomial certificate-checking algorithm for A. If a "yes" certificate exists, at least one of the branches of the tree will report "yes". All the above, require polynomial time.
(3)
(1)
If A
SAT,
then every "yes" instance of A has a succinct certificate, namely the solution
to the SAT problem which results from the transformation. This certificate
is of polynomial length (because SAT
NP)
and can be checked in polynomial time by testing that all clauses are indeed
true.
(2)
(3)
This is the difficult part and not possible to follow without much formalism. What is needed, is, first to define a formal model of a non-deterministic algorithm (in terms of Turing machines, which are formal computation models) and second, to show that SAT is expressive enough to enable us to represent this model by a Boolean formula in CNF (Conjunctive Normal Form - see B1). This formula would be satisfiable if and only if the model of the non-deterministic algorithm could provide an answer in polynomial time.
For a complete proof of (1)
(2)
(3)
see [5].
A most important implication is:
| SAT is NP-complete |
SAT was therefore the "father" of all NP-complete
problems. All others followed "generation"-like on the basis of the transitive
property of polynomial transformability (see
figure in section C3). For example, SAT
IP-rec
(see example in section C2) which
means that IP-rec is also NP-complete and so also a candidate "father"
for other NP-complete problems. This allows to be more specific in the
basic NP-completeness property and say "if there is a polynomial-time algorithm
for either SAT or IP-rec or ..., there is a polynomial-time algorithm for
every problem in NP, i.e. for every "reasonable" problem". Expressed otherwise,
if only one of these "generated" NP-complete problems could be solved efficiently
this would immediately imply that class P=class NP.
The importance of Cook's result is evident.
There is now a host of NP-complete problems (the "Bible" is the excellent
book by Garey and Johnson [3]) but none of them has
been solved efficiently, i.e. in polynomial time. This is the strong evidence
for the conjecture P
NP
mentioned in section C1.
The theory of Computational Complexity started with Cook's paper [2]. However, the wealth of the consequences and its close relationship to combinatorial optimization were made clear in a classical paper by R.M.Karp [4].
To close this section, we examine another problem which we know to be in NP, the recognition version of the Maximum Clique problem, and prove that this problem is also NP-complete.
Maximum Clique-rec:
"Given a graph G= (V,E), where V is a set
of vertices (nodes) and E a set of links, and an integer k, is there a
clique of size k in G ? (i.e. a subset of V, say V", with |V"|=k and all
vertices in V" connected to each other)"
To show that Maximum Clique-rec is NP-complete,
one way is to show that SAT
Maximum
Clique-rec.
Now, given an instance of the SAT, that is a Boolean formula in CNF (Conjunctive Normal Form), we can construct an equivalent instance of Maximum Clique-rec which reports "yes" if and only if the Boolean formula is satisfiable. To do this, we construct a non-directed graph with vertices representing every literal in the formula (a literal is either a simple variable or a negated one). We add links between any two vertices in the graph, except when:
i. the vertices represent literals in the
same clause, or
ii. the two vertices represent complementary
literals, such as x and ¬x
An example is given below, for the formula
(x1 or ¬x2) and (¬x1 or ¬x2)
and (x1 or ¬x2 or x3)
It is easy to show now that this graph has a clique of size k = (number of clauses in the formula) = 3 in the example, if and only if the Boolean formula is satisfiable. Indeed, if there is a clique of that size, this corresponds to selecting exactly one literal from each clause. For if there are more than one literals from some clause, their corresponding vertices are not connected, by construction rule (i). In addition, the vertices in the clique cannot correspond to a pair of complementary literals, by construction rule (ii). Therefore a valid truth assignment is produced by setting each literal/vertex in the clique to be true.
Conversely, if the formula is satisfiable, selecting exactly one true literal from each clause leads to a clique of size k=(number of clauses in the formula).
Since the construction of this graph can
be done in polynomial time, SAT
Maximum
Clique-rec and hence, Maximum Clique-rec is NP-complete.
D2. Some concluding remarks
The somewhat dark picture on the problems that can be solved efficiently does not imply that in practice we are hopeless:
This is an over-simplification of what is currently known in the field. Many "countries" (and some "worlds") are missing. These have to do with advanced topics, beyond the scope of this introduction.
Finally, it is interesting to note that simplifying an NP-complete problem does not necessarily reduce its complexity. For example, an astonishing result is that SAT remains NP-complete even if all clauses have at most 3 literals (3-SAT). This special case would, at first glance, appear to be a rather "easy" problem but it turns out to be "at least as hard" as any other NP-complete problem (2-SAT is however is in P - there seems to be a "cut-line" between magnitudes 2 and 3 and this mystical property holds for several other problems).
Part A: Introduction, measuring the efficiency of algorithms, polynomial and exponential time algorithms
Part B: Recognition problems, the Satisfiability Problem, non-deterministic algorithms
Part C: Classes P and NP, polynomial reductions and transformations, NP-complete problems
Part D: Cook's theorem and implications, some concluding remarks