In my previous post, I presented three hypothetical machines which take a positive integer as input and give us something else as output:
- a factorization machine gives us the complete prime factorization of ;
- a factor machine gives us one prime factor of ;
- a primality machine just tells us whether or not is prime (that is, it gives us an answer of either true or false).
I mentioned that we know how to make primality machines that are much faster than factorization machines, or even than factor machines. Before I finally get around to explaining how to build such fast primality machines, it’s worth explaining what I mean when I talk about a machine being “fast” or “slow”.
Of course this whole time when I have been talking about machines, what I really have in mind are algorithms, i.e. specific sets of steps to take some given input and produce a desired output. In other words, computer programs. (It’s a fascinating and wonderful fact that computers are universal machines, in the sense that they can simulate any other machine; instead of having one special machine to write email, and another special machine to read Wikipedia, and yet another special machine to find prime numbers, you can just have one single, general-purpose machine that can do all of these things. But this is a subject for another blog post!)
The obvious, naive way to create a factor machine is as follows:
- For each positive integer from up to :
- Try dividing by .
- If evenly divides , that is, the remainder is , then stop and output .
- If every from to is tried without finding one that evenly divides , then must be prime, so output .
This is called trial division. (Can you see why it will always return a prime divisor of ?) There are several ways this can be optimized, but we’ll get back to those later. For now let’s think about how long this takes.
When measuring how long an algorithm takes, it is useless to measure the actual running time, in seconds, on some particular computer. For one thing, the amount of time the algorithm takes to run is highly dependent on the particular computer executing it, what other programs are running on the same computer, the weather, time of day, etc., so it is quite difficult to compare to other algorithms. For another thing, the algorithm may take different amounts of time for different inputs. When mathematically analyzing how long an algorithm takes, what we really care about is how the running time scales as a function of the size of the input. This tells us something about how big the inputs can get before the algorithm takes an infeasibly long time to run.
So, what about trial division? The input is an integer ; the size of the input is the number of digits needed to write . (It is more typical to measure the size in terms of the number of bits needed to write it in binary, but it turns out that the base doesn’t really matter, so we’ll stick to base 10 for now.) Let be the number of digits needed to write , so , and . How many steps does the algorithm take to run? We do one division operation for each from up to , so it essentially takes time proportional to . Since , this means that the time needed to run scales exponentially with the size of the input: every time we add one more digit to the input, we multiply the time needed by a factor of .
At this point you may protest that the version of trial division I gave above is hopelessly naive; and indeed, we can optimize it. For example, we don’t have to try every : if we try dividing by first, after that point we can just try dividing by all the odd numbers up to , since if doesn’t divide then no other even number can either. This essentially halves the necessarily running time; but is still proportional to .
A more important optimization is to only test values of up to , instead of going all the way up to . If is the product of two divisors, , then one of them (say, ) must be , and the other must be . So if we haven’t found any divisors by the time we get to then we certainly aren’t going to find any above .
So now the algorithm only takes time proportional to , which seems like it is a big improvement. To be sure, it is an improvement; but if we relate it back to the number of digits , we find , so the running time still grows exponentially in the number of digits , just with a smaller base.
In general, exponential algorithms are fairly useless—for even modest-sized inputs, the running time can be astronomical. For example, suppose an algorithm takes exactly seconds to run on an input of size . When it takes seconds; when , seconds. So far, so good. For it takes a couple minutes ( seconds); for , it takes 15 minutes; for , a few hours; for , a whole day. By the time we get up to an input of size , the algorithm takes over 300 years. For an input of size , it takes over 30 million years. We only have to get up to size or so before the algorithm would take longer than the estimated age of the known universe.
In some sense, we don’t know how to do better than this for factor machines and factorization machines. There are factorization algorithms which are indeed much faster than trial division (with fancy names like the Generalized Number Field Sieve). Using such algorithms it’s feasible to factor numbers with up to hundreds of digits instead of just a few—but nevertheless, the running time of these algorithms still scales essentially exponentially with the size of the input. (There is actually quite a lot of technical detail I am sweeping under the rug here, but I don’t really want to get into it so I hope you will forgive me.)
Primality testing, on the other hand, is a completely different story! Trial division is the only algorithm most people know for primality testing, but over the next few posts I will explain a few other algorithms (and prove that they work!) which are much faster.