As promised (better late than never), I’m going to begin explaining the (in)famous P vs NP question (see the previous post for a bit more context). As a start, here’s a super-concise, 30,000-foot version of the question:
Are there problems whose solutions can be efficiently verified, but which cannot be efficiently solved?
Of course, this is so vague as to be almost meaningless! My job over the next few posts will be to un-vagueify it for you. Looking at the statement, three questions may immediately come to mind, and we’ll consider each in turn:
- What counts as a problem?
- What do we mean by efficient?
- What do we mean by verify?
Today we’ll start by considering the first question: what counts as a problem?
The P vs NP question is usually stated in terms of decision problems, which are problems with a yes/no answer. For example, Is there a way to drive from here to Kalamazoo without driving on a major highway? Either there is a way to do it, or there isn’t, and we might hope that a computer (given an appropriate map as input) would be able to tell us which.
Of course, not every problem is a decision problem. For example, How long does it take to get to Kalamazoo if you go by the fastest possible route? We might hope that a computer could answer this question too, but the answer we are looking for is an amount of time, rather than yes or no. This is not an isolated example; there are lots of interesting and important problems like this. So why restrict ourselves to just decision problems?
It’s easy to see that decision problems are the simplest possible kind of problem. The answer carries just a single “bit” of information. (The only thing that could possibly be simpler than a problem with a yes/no answer is a problem with a “no/no” answer, which carries zero information. For example, “Does this dress make me look fat?” But these aren’t the sorts of problems we are usually interested in solving with a computer.) There is a long scientific tradition of studying the simplest possible kind of something, in order to gain insight into more complicated things as well. That’s what we’re doing here.
But can studying decision problems really give us insight into problems with more complicated answers as well? Actually, yes! To see an example of what I mean, consider again our question How long does it take to get to Kalamazoo if you go by the fastest possible route? This is a problem with a “complicated” answer (you might think that a single number is not very complicated, but you have to admit that it is way more complicated than a simple yes/no!). But now consider this related question: Is it possible to get to Kalamazoo in minutes or fewer? Actually, this is a whole family of questions parameterized by the time . For any particular , this is clearly a decision problem with a yes or no answer. But if we had a computer program which could efficiently solve this problem for any , we could use it to solve the original problem to any desired accuracy. For example, we might ask the following series questions, with the answer to each question shown in bold:
- Is it possible to get to Kalamazoo in 1000 minutes or fewer? YES
- Is it possible to get to Kalamazoo in 500 minutes or fewer? NO
- Is it possible to get to Kalamazoo in 750 minutes or fewer? YES
- Is it possible to get to Kalamazoo in 625 minutes or fewer? NO
- and so on…
After only a few more questions, we would learn that if we go by the fastest possible route and go exactly the speed limit the whole time and never stop for food or gas or to use a restroom, it would take exactly 665 minutes to get to Kalamazoo.
“That’s all very well,” you might say, “I can see how that works for problems with numerical answers. But what about answers that are more complicated yet?” For example, what if we wanted to know not only the time it takes to get to Kalamazoo by the fastest route, but also the route itself? This is certainly not just a number, it is a list of turns, or a list of street names, or something complicated like that. Surely we can’t reduce this to just some decision problem?
Well, no matter how complicated the answer to a problem, it is always possible to encode the answer in binary, as a sequence of ones and zeros. For example, to encode a series of turns we might agree to encode a left turn as 01, a right turn as 10, going straight as 11, and use 00 to indicate the end of the directions. So 011110101100 would mean “Turn left, then go straight, then turn right twice, go straight again, and you’re there.” In order to encode street names we would need a much more complex encoding but it can be done easily enough. For example we might encode the letter A as 00000, B as 00001, C as 00010, and so on.
OK, so what? Here’s the punchline: given a problem P with some complicated answer and a suitable way to encode the answer to P in binary, we can make the following decision problem:
Is the th binary digit in the binary encoding of the answer to P a one?
I hope you can see how we could use a computer program capable of solving this problem for any to solve the original problem: just ask about the binary digit (or bit) at each position, and build up the answer to P one bit at a time.
Perhaps you think this is bit contrived. But it shows that in some sense, there’s nothing fundamentally interesting going on with problems that have complicated answers; we can always reduce them to decision problems. So for these reasons (and perhaps a few others), the P vs NP question is phrased in terms of decision problems only. We can now restate the P vs NP question a bit more precisely:
Are there decision problems whose solutions can be efficiently verified, but which cannot be efficiently solved?
In my next post, I’ll explain what we mean by that hand-wavy word efficiently.
hello,
i wanted to say that as a math undergraduate student, I am very happy to have stumbles upon this blog. i feel it clarifies questions i never dared to ask, and poses interesting problems that i am glad to think about. all in all, i want this comment to be taken as encouragement to continue what i think is a great blog with fascinating topics.
thank you
Sam, thanks for the encouragement! I’m very glad you enjoy it.
This is really interesting as usual, but I’m just wondering about one thing, and that’s your definition of ‘solving’ something. I have no experience in computer science, so when I think of ‘solving’, say, x^2 – 2x + 1 = 0, I don’t think of finding answers ‘as accurate as you want,’ I think of literally figuring out that x = 1.
When you talk of a computer program figuring out the shortest time to get from A to B, you use a method of finding this time to a ‘sufficiently accurate degree,’ rather than of actually finding the real value of this fastest time. Is this just what solving a problem means in computer science?
Hi Tom, great question. There’s actually nothing special about computer science here; the issue arises in a purely mathematical setting as well. The fact is that not every problem CAN be solved exactly. Something like x^2 – 2x + 1 = 0 is a very special case since it has the EXACT solution x = 1 (and computers can solve such problems exactly, just as you would). But consider an equation like x^5 – 3x + 2 = 0. We know there must be a real solution for x (the graph of any odd-degree polynomial goes off to infinity in different directions and hence must cross the x-axis at least once), and the solution is approximately 0.632835… but it turns out there is no better way to write the solution for x other than to write an approximation to as many decimal places as you want. Likewise, the shortest time from A to B might not be a “nice” number — any solution you care to write down will necessarily be an approximation. Ultimately what it means to “solve” a problem depends on the problem and on what sorts of things you are willing to accept as solutions.
Hmm… hopefully this helps, I’m just skimming the surface here. There’s probably an interesting blog post or two in there, come to think of it.
Haha, I figured the definition of ‘solve’ isn’t quite universal. The example with a 5th degree polynomial makes sense… though I still feel that finding a million decimal places isn’t quite ‘solving it.’ (On a semi-related side note, is it even possible that there is an exact, expressible solution out there that we haven’t found? Or are solutions to these polynomials transcendental?)
For what it counts, as another (junior) undergrad, I agree 100% with Sam Most math blogs are either by Terrence Tao et. al. and completely incomprehensible, or in some way not as interesting as this blog. Cheers
Well, solutions to polynomials are never transcendental (by definition — transcendental numbers are exactly those numbers which are not roots of polynomials with integer coefficients). So I actually lied a bit — there IS a nice, finite way to describe the real root of x^5 – 3x + 2. Here it is: “the real root of x^5 – 3x + 2”. Now, you might think that is cheating a bit, but you have to admit it is precise, unambiguous, and finite! (It so happens that x^5 – 3x + 2 has only one real root, but for polynomials with more you could say something like “the largest real root…” or “the second largest…” or whatever.) As for other nice ways to write it down — there may be some, but I don’t know what they are. What I do know (and why I chose degree 5) is that in general the roots of polynomials of degree 5 and above cannot be written down using just arithmetic operations and roots (see http://en.wikipedia.org/wiki/Insolubility_of_the_quintic), except in special cases.
HEEEEYYY! Im Jason and Im 12 and im doing a project in math on p vs np and itsss SSSOOOOOOOOOOO confusing… When are you going to post the explanations of the other two other questionsss?????? THIS IS SUCKYYYY
I will risk embarrassment by correcting a mathematician about math…I think the direction code that you provide (011110101100 ) should indicate “turn right twice” not “turn left twice” as you’ve said. Right?
Great blog!
jon: Absolutely right, thanks! It’s fixed now.