CSCC63 – Week 12

This week: We’ll do a course review.

Course Review

The topics we’ve been looking at in this course have been about the theoretical limits of

computation – what can we ever hope to be able to do with computers, and even logic itself?

By computation, we mean any computer we could ever build, not just any computer that we have

today.

The biggest takeaway is this: there are limits to what computation can do.

If this wasn’t true, we’d be able to build a liar’s paradox into logic to get a contradiction.

Remember, a liar’s paradox is a statement like, “This statement is false". It can’t be either

true or false, which is why we can’t express it in logic, where statements must be either

true or false.

We’ve demonstrated that:

• Not all problems can be solved at all.

Yes/no questions that can be solved are decidable.

• Even fewer problems can be solved efficiently (in polynomial time).

Yes/no questions that can be solved efficiently are in the class P.

• There are problems that we may or may not be able to find a solution for, but whose solutions can be verified when the answer is yes. The issue is that it may be impossible to show that no solution exists when the answer is no.

Yes/no questions whose yes-instances can be verified are recognizable.

• Some problems may or may not admit efficient solutions, but can be verified efficiently.

Yes/no problems that can be verified efficiently are in the class NP. Determining whether this is the same as the class P is one of the biggest problems in theoretical computer science.

• There are problems that we may or may not be able to find a solution for, but whose complements are recognizable.

These are co-recognizable.

• Some problems are neither recognizable nor co-recognizable.

The difference between solving a problem and verifying a solution is one of the more interesting ideas we’ve seen, actually. Have you ever had difficulty solving a problem, only to find that the solution made perfect sense when you finally saw it? This is a similar idea.
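That difference can be made concrete. Below is a minimal sketch of a polynomial-time verifier for the Clique problem — checking a proposed k-clique is easy, even though finding one seems to require search. The dictionary graph encoding and the function name here are illustrative, not the course’s notation:

```python
from itertools import combinations

def verify_clique(graph, certificate, k):
    """Check, in polynomial time, that `certificate` is a k-clique.

    `graph` maps each vertex to the set of its neighbours.
    """
    if len(set(certificate)) < k:
        return False
    # Every pair of certificate vertices must be adjacent.
    return all(v in graph[u] for u, v in combinations(certificate, 2))

# A small example graph: {1, 2, 3} is a triangle, but {1, 2, 4} is not.
g = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
```

The verifier runs in O(k²) time, while the only obvious way to *find* a k-clique is to try all C(n, k) candidate sets.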


The problems that are not solvable can take many forms — since we initially prove undecidability by using self-reference, it shouldn’t be a surprise that many questions regarding code analysis are undecidable. But other natural problems are also undecidable, such as:

• Determining whether a multi-variable polynomial with integer coefficients has an integral root (Hilbert’s 10th problem).

• Determining whether an arbitrary set of dominoes can be lined up so that their top and

bottom strings are the same (the Post Correspondence Problem).

• Determining whether the languages of two CFGs have an empty intersection.

In order to be able to find relationships between arbitrary programs and the problems in these different domains, we defined a mathematical object called a Turing Machine that encodes the properties of a computer. Church’s Thesis states that anything we can do with an algorithm can also be done with a TM.

How do we prove that a problem is hard?

There are a few ways of doing this:

• We started our course by using self-reference to build a contradiction (the liar’s paradox). This process is called diagonalization. Using this method let us show that HALT is undecidable.

A similar method lets us show that P ≠ EXP and L ≠ PSPACE.

We’ve also argued that diagonalization will not be enough to determine whether P = NP. This proof itself, though, relies on diagonalization…

But beyond the proof that HALT is undecidable, you don’t need to worry about diagonalization for this course.

• Most of our difficulty proofs have been built around the idea of reductions: If we know that a problem A is hard, and we want to show that another problem B is also hard, we find a way of reformatting the instances of A into instances of B. If we could solve B, we’d be able to solve A as well.

To get this to work, we need to know that the language A is hard. That’s why this method doesn’t help us determine whether P = NP, but it does let us show that if P ≠ NP, many languages are hard.

Reductions can also run into trouble when the language B is so “random", in some sense, that computing the reduction isn’t possible, even though B may be very hard. One way of approaching this is something called Kolmogorov complexity. But you don’t need to know that for this course.

• Other methods of proving hardness exist, but are not covered in this course.
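The reformatting step in a reduction can be sketched in code. Here is the classic mapping reduction from HALT to “does this program halt on the empty input?”, modelling programs as Python callables — the representation and function names are illustrative, not from the notes:

```python
def reduce_halt_to_halt_on_empty(M, w):
    """Map a HALT instance (M, w) to a one-input program M_prime such
    that M_prime halts on the empty string iff M halts on w."""
    def M_prime(_ignored):
        # Throw away the actual input and simulate M on the fixed w.
        return M(w)
    return M_prime

def halting(s):
    return len(s)   # a program that halts on every input

# If some decider D could answer "does this program halt on the empty
# input?", then D(reduce_halt_to_halt_on_empty(M, w)) would decide
# HALT -- a contradiction, so no such D exists.
```

Note that the reduction itself is trivially computable; all of the hardness lives in the behaviour of the program it outputs.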


A surprising fact that we’ve seen is that there are problems that are complete for many of the

complexity classes we care about: these are problems in the classes which, if we could solve

them, would allow us to solve every other problem in the class.

The big example we showed was that 3SAT is NP-complete — it’s a problem in NP, and we can solve any other problem in NP by encoding its TM as a boolean circuit in 3CNF and solving the resulting problem.

We’ve seen the proof of this in class. I’d suggest working to understand it – or at least the rough framework around it. The idea was that we could trace the configurations of a TM using a boolean formula so that it accepts iff the TM reaches an accept state.

As a matter of fact, we don’t believe that 3SAT can be solved in less than 2^Θ(n) time: so we can’t do much better than a brute-force search. If this is true, then logic is sometimes very limited in what it allows us to do.

On the other hand, if we were able to solve these problems quickly, then human creativity

in problem solving really isn’t much use: an automated problem solver would work just

as well.
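The brute-force search mentioned above is easy to write down. A toy solver — the clause encoding (a literal as a (variable, polarity) pair) is invented here for illustration — that tries all 2^n assignments:

```python
from itertools import product

def brute_force_sat(n_vars, clauses):
    """Try all 2^n truth assignments; return a satisfying one, or None.

    A clause is a list of (variable_index, is_positive) literals; an
    assignment satisfies it when at least one literal evaluates true.
    """
    for assignment in product([False, True], repeat=n_vars):
        if all(any(assignment[v] == positive for v, positive in clause)
               for clause in clauses):
            return assignment
    return None

# (x0 or x1) and (not x0 or x1) and (not x1 or x2)
phi = [[(0, True), (1, True)],
       [(0, False), (1, True)],
       [(1, False), (2, True)]]
```

The loop is the whole story: 2^n candidate assignments, each checkable in polynomial time — exactly the gap between searching and verifying.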

These classes — P and NP, the decidable languages, and so on — are all classes of languages: their elements are sets of strings. But we can also talk about computable functions, that is, about the return values that we can reasonably expect a program to compute.

These functions have analogues to the language classes — some functions can be computed

for all possible input values, while some can only be calculated for some inputs. A function

that can be computed for all inputs is a computable function.

Functions that can be computed for all inputs in polynomial time are referred to as the class FP.

There is a class of functions that is equivalent to NP, as well — this class would, for example, include the function that takes as input a graph G and returns the largest clique in G. This class is referred to as FNP. Any function in FNP is downward-self-reducible — that is, if we could decide the corresponding decision problem in NP, we could also calculate these functions. We saw examples of these reductions when we looked at how to find the largest clique in a graph G in tutorial.

We didn’t actually see the name “downward-self-reducible” before, but we have done the calculations. The calculations are what you need to know.
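That downward self-reduction can be sketched concretely for Clique. Everything below is illustrative — the dictionary graph encoding, and the brute-force `has_clique` standing in for the NP decision oracle:

```python
from itertools import combinations

def has_clique(graph, k):
    """Yes/no NP question: does `graph` have a clique of size k?
    (Brute-force stand-in for a decision oracle; exponential time.)"""
    return any(all(v in graph[u] for u, v in combinations(c, 2))
               for c in combinations(graph, k))

def largest_clique(graph, oracle):
    """Downward self-reduction: recover a maximum clique using only
    polynomially many calls to the yes/no oracle.

    `graph` maps each vertex to the set of its neighbours.
    """
    n = len(graph)
    # Step 1: find the maximum clique size k* with n+1 oracle calls.
    k = max(k for k in range(n + 1) if oracle(graph, k))
    # Step 2: try deleting each vertex; keep the deletion whenever a
    # k*-clique still exists.  The surviving vertices form a k*-clique.
    g = {v: set(ns) for v, ns in graph.items()}
    for v in list(g):
        trial = {u: ns - {v} for u, ns in g.items() if u != v}
        if oracle(trial, k):
            g = trial
    return set(g)

# Example: a 4-vertex graph whose maximum cliques are {1,2,3} and {1,3,4}.
g = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
```

Only the oracle is expensive; the surrounding search makes O(n) calls to it, which is the whole point of the reduction.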

When we look at functions, though, there’s another important class: the class of functions that, given a problem in NP (actually a verifier for a problem L in NP) and an instance x, counts the number of certificates for x in L. This class of functions is known as #P, pronounced sharp-P.

The class #P is important because it shows up when we mix probability and complexity — it is equivalent to asking, “If we randomly choose a certificate c for an instance x, what is the probability that c is a satisfying certificate?”. This sort of problem shows up very naturally when, for example, performing inference over Bayesian networks.
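The canonical #P function is #SAT: counting, rather than finding, satisfying assignments. A brute-force sketch (the clause encoding is invented here for illustration) that also shows the probability connection:

```python
from itertools import product

def count_sat(n_vars, clauses):
    """#SAT by brute force: count the satisfying assignments of a CNF.
    A clause is a list of (variable_index, is_positive) literals."""
    return sum(
        all(any(a[v] == positive for v, positive in clause)
            for clause in clauses)
        for a in product([False, True], repeat=n_vars)
    )

# The probability view from the notes: a uniformly random certificate
# satisfies phi with probability count_sat(n, phi) / 2**n.
phi = [[(0, True), (1, True)]]   # the single clause (x0 or x1)
```

For this `phi`, three of the four assignments work, so a random certificate succeeds with probability 3/4 — exactly the kind of quantity Bayesian inference needs.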


This class is believed to be much harder to solve than the classes NP and FNP — if you can tell that a 3CNF φ is satisfiable, you’ll still have a hard time telling how many truth assignments will work. A reasonable language class analogue for #P is the majority class PP. Languages in this class consist of NP instances that are accepted by a majority of their certificates.

Reductions A ≤p B in #P look a lot like the reductions we use in NP and PSPACE, with one difference: we make sure that there is a regular (e.g., 1-to-k) correspondence between the certificates in A and the certificates in B. These reductions are called parsimonious.

We’ve also had a look at space complexity, in which we measure the memory (TM tape cells)

used by the algorithm, instead of the time taken.

The big classes we’ve seen here have been PSPACE, which is the set of problems that require a

polynomial amount of space, L, which is the set of problems that require a logarithmic amount

of space (aside from the input), and NL, which is the non-deterministic version of L.

PSPACE, in particular, is important because it encodes what happens if we have two agents competing against each other. It is also hard for a number of important classes that are believed to be harder than NP, such as the problem of counting the number of solutions to the instances of NP-complete problems.

If we do have to try to solve an intractable function — for example, if we want to try to guess the value of a function in #P — we can try approximating the answer. Different problems have different behaviours in this regard:

• We can come up with arbitrarily good approximations for a number of problems, like Subset-Sum.

• Many other problems can be approximated to within some constant factor – we showed how to find a 2-approximation for Vertex-Cover.

• On the other hand, some problems such as Clique can’t be approximated to within any constant factor.
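For the Vertex-Cover case, the standard matching-based algorithm gives the factor of 2 — sketched here under the assumption that this is the construction from class:

```python
def vertex_cover_2approx(edges):
    """2-approximation for Vertex-Cover: repeatedly pick an edge with
    both endpoints uncovered and add both endpoints.

    The picked edges form a matching, and any cover must contain at
    least one endpoint of each matched edge, so |cover| <= 2 * OPT.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Path 1-2-3-4: the optimal cover is {2, 3}; this algorithm may return
# all four vertices, exactly a factor of 2 away.
path = [(1, 2), (2, 3), (3, 4)]
```

The analysis is the interesting part: the bound comes entirely from the matching, with no knowledge of the optimal cover itself.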

We did have to make several conjectures as we went through the course, though. Here are

some of the big ones:

• Does non-determinism add any power to efficient TMs?

This is the P versus NP question. We believe fairly strongly that P ≠ NP.

• Does it add any power to measure space taken, rather than time?

This is the P versus PSPACE question. We believe very strongly that P ≠ PSPACE.

• Is it harder to count the number of certificates to a problem in NP than to find one?

This is the question of whether NP ≠ PP or not. We strongly believe that they are different classes.

• Does NP = co-NP?

We believe that these are different classes.

• Do we get any extra power by letting a TM flip coins and use probability?

We believe that the answer is no – that we can always derandomize our algorithms.


Overall, we believe that the landscape looks like this:

[Diagram: nested containments L ⊆ NL ⊆ P ⊆ NP ⊆ PP ⊆ PSPACE ⊆ EXP ⊆ …, with the NP-complete problems sitting inside NP but outside P.]

We’ll spend the rest of the lectures covering example problems in preparation for the final.

Practice

Here is a list of practice problems, most of which deal with the complexity topics in the course. The questions labeled hard are more difficult than I would expect you to be able to do in an exam — and in some cases will require you to refer to the textbook. However, they are good for practice.

1. Language Classes

There are several languages below. Try to determine which language classes they belong to (P, NP, co-NP, NP-complete, PP, PSPACE, decidable, recognizable, etc.). Give justification.

• Triple-Prime:

IN: An n ∈ N.

Question: Is n the product of three distinct primes a, b, and c?

• Large-Clique:

IN: A graph G.

Question: Does G have a clique of size at least n − 100, where n is the number of vertices in G?

• Count-3Col:

IN: A graph G, a k ∈ N.

Question: Does G have at least k distinct 3-colourings?


• Is-Clique:

IN: A TM 〈M〉.

Question: Does M recognize Clique?

• 99%-Col:

IN: A graph G.

Question: Does G have a k-colouring, where k = 0.99|V|?

• Triple-VC:

IN: A graph G.

Question: Does G have three disjoint vertex covers?

2. NP-completeness

Show these languages are NP-complete:

• Odd-Subset-Sum:

IN: A multiset S = {x1, . . . , xk}, and an odd t ∈ N.

Question: Is there a multiset A ⊆ S such that ∑_{xi ∈ A} xi = t?

• Knapsack:

IN: A list {w1, . . . , wn} of weights, a list {v1, . . . , vn} of values, a weight limit W, and a target value V.

Question: Is there a set I ⊆ {1, . . . , n} of indices such that ∑_{i ∈ I} wi ≤ W and ∑_{i ∈ I} vi ≥ V?

• (hard) Exactly-One-3SAT:

IN: A 3CNF formula φ.

Question: Is there a satisfying assignment for φ that sets exactly one literal

true per clause?

This is more time consuming than hard.

• ≠3SAT:

IN: A 3CNF formula φ.

Question: Is there a satisfying assignment for φ that sets at least one literal true and at least one literal false in each clause?

Note: This is question 7.26 from the text.

• (hard) Max-Cut:

IN: A graph G, a number k.

Question: Can the nodes of G be split into two sets S and T such that the

number of edges crossing from S to T is at least k?

Note: This is question 7.27 from the text.


• Set-Cover:

IN: A finite set X, a collection F of subsets of X such that ⋃_{S ∈ F} S = X, and a number k ∈ N.

Question: Is there a collection C ⊆ F of size at most k such that ⋃_{S ∈ C} S = X?

• Hitting-String:

IN: A finite set A of strings over {0, 1, ∗}, all having the same length n.

Question: Is there a string x ∈ {0, 1}* of length n such that for each a ∈ A there is some i, where 1 ≤ i ≤ n, for which the ith character ai of a and the ith character xi of x are equal?

• Clique-or-IS:

IN: A graph G = (V,E).

Question: Does G have either a clique or an independent set of size |V |/2?

• DFSA-Intersection:

IN: A set of k cycle-free DFSAs S = {M1, M2, . . . , Mk}.

Question: Do the languages of the DFSAs have a non-empty intersection? That is, is ⋂_{M ∈ S} L(M) ≠ ∅?

You know from CSCB36 that the intersection of regular languages is regular. So why doesn’t that cause a problem with this being NP-complete?

Note: without the cycle-free qualifier, this would be PSPACE-complete.

• Vector-Permutation:

IN: A matrix A of integers and an integer vector y.

Question: Is it possible to permute the entries of y to get a new vector yπ in such a way that there is some integer vector x so that yπ = Ax?

Note: This question can be found in a number of maximum-likelihood estimation problems. The fact that this is NP-complete is one of the reasons that EM problems are NP-hard for non-convex domains.

3. PSPACE

• Consider the QBF

φ = ∃x0 ∀x1 ∀x2 ∃x3 (x0 ∨ x2 ∨ x3) ∧ (x1 ∨ ¬(x1 ∨ x0)) ∧ (x0 ∨ x3)

Is φ in TQBF? Why?

Use the construction from the Generalized Geography reduction in class to build a graph Gφ from φ.

• Show that PSPACE is closed under the union, intersection, concatenation, and

Kleene star operations.

4. Counting and #P

• Look up the NP-completeness reductions for Ham-Path and for 3COL from your

text. Try to make these reductions parsimonious.


5. Approximations

• We showed in class that we can reduce 3SAT into Clique in such a way that approximation gaps are preserved or enlarged, and then we showed that we can reduce Clique to itself in a way that blows up approximation gaps.

What happens to the approximation gaps in 3SAT when we run them through the reduction into Subset-Sum?

The Take-Away from this Lesson:

• We did a review of the ideas in the course, and we got ready to look at specific questions in preparation for the final.

Glossary:

No new terms today, since this is a review.
