The sum of the primes below 10 is 2+3+5+7=17.
Find the sum of all the primes below two million.
https://projecteuler.net/problem=10
Pretty easy (but slow) if I use my old algorithm. But let’s make it harder on me. Rather than modify my implementation of the Sieve of Eratosthenes, let’s do the Sieve of Atkin. Compared to Eratosthenes, Atkin is brand-spanking-new. The paper describing the algorithm was published in 2004. Eratosthenes’s and Atkin’s sieves are two millennia apart! Such a long time surely would have resulted in a big improvement, right? Let’s find out.
Wikipedia has an excellent description of Atkin’s algorithm. Let’s implement it one step at a time.
1. Create a results list, filled with 2, 3, and 5.
As soon as the word “list” shows up, it’s important to consider if the list should be implemented as an array or a linked list. These data structures have different time complexities for various operations such as accession, inserting, removing, and so on. The list of results will mostly be appended to, so let’s stick with an array.1Actually, Python’s list implementation is a dynamic array. See StackOverflow for a discussion.
results = [2, 3, 5]
Let’s stick that in a function with our limit as the argument so the code can be reused for future problems.
def sieve_of_atkins(limit): """Returns all prime numbers up to limit. The limit must be above 5.""" results = [2, 3, 5] # More to come... return results
2. Create a sieve list with an entry for each positive integer; all entries of this list should initially be marked non prime (composite).
Making a list filled with numbers up to n is easy enough, but how do we mark them composite or prime? I think a Python dictionary is the best suited for this purpose. It’ll be keyed on the integers with boolean values.
sieve = {i: False for i in range(1, limit)}
That’s a lot of entries. This means that the Sieve of Atkin’s space complexity is going to be at least \mathcal{O}(n). Sometimes, it’s wise to trade off space complexity for time complexity. We’ve got oodles of RAM these days, but the length of a second has remained constant.
3. For each entry number n in the sieve list, with modulo-sixty remainder r:
There goes that word: “for”. We need a for
loop that iterates though our entries. Step 3 also defines r, which is the remainder when n is divided by 60.
for number in sieve: remainder = number % 60 # More to come...
3a. If r is 1, 13, 17, 29, 37, 41, 49, or 53, flip the entry for each possible solution to 4x^2+y^2=n.
The if
-statement is easy enough to implement, but what does “each possible solution to 4x^2+y^2=n” mean? We need to flip the primality of the entry as many times as there are integer solution pairs of 4x^2+y^2=n. This sub-problem is equivalent to counting the lattice points of Quadrant I that lie on an ellipse. If there is an odd number of lattice points on the ellipse, the entry for n is flipped an odd amount of times, resulting in the entry ultimately being marked as prime.
This is a sub-problem that deserves its own post. Check out my post on finding lattice points on an ellipse, then come back for the rest. With the code imported from that post, Step 3a is implemented as:
if remainder in {1, 13, 17, 29, 37, 41, 49, 53}: number_of_integer_solutions = len(find_quadrant1_lattice_points_of_ellipse(number, 4, 1)) for i in range(number_of_integer_solutions): sieve[number] = not sieve[number]
3b. If r is 7, 19, 31, or 43, flip the entry for each possible solution to 3x^2+y^2=n.
This is very similar to Step 3a so the above if
-statement can be replaced with a few different numbers.
if remainder in {7, 19, 31, 43}: number_of_integer_solutions = len(find_quadrant1_lattice_points_of_ellipse(number, 3, 1)) for i in range(number_of_integer_solutions): sieve[number] = not sieve[number]
3c. If r is 11, 23, 47, or 59, flip the entry for each possible solution to 3x^2-y^2=n when x>y.
Again, pretty similar except that the curve is a hyperbola rather than an ellipse, necessitating a different function.
if remainder in {11, 23, 47, 59}: number_of_integer_solutions = len(find_quadrant1_lattice_points_of_hyperbola(number, 3, -1)) for i in range(number_of_integer_solutions): sieve[number] = not sieve[number]
3d. If r is something else, ignore it completely.
With great pleasure!
4. Start with the lowest number in the sieve list.
5. Take the next number in the sieve list still marked prime.
6. Include the number in the results list.
We’ll just iterate through the sieve
again, this time looking for True
s and saving them in results
. Dirty composite numbers will be passed over!
for number in sieve: if sieve[number] == False: continue else: results.append(number)
7. Square the number and mark all multiples of that square as non-prime.
The important consideration here is knowing when to stop, lest we get an KeyError
. We need to make sure that ki^2 < n. Solving for k shows that the maximum multiple will be \frac{n}{i^2}. Step 7 now can be implemented as another loop iterating through multiples of the square until the maximum multiple is hit.
square = number**2 max_multiple = math.floor(limit / square) for multiple in range(1, max_multiple + 1): sieve[multiple * square] = False
Here’s the complete function implementing the Sieve of Atkins (with some flaws, dependencies, and redundancies, but oh well).
def sieve_of_atkins(limit): """Returns all prime numbers up to limit. The limit must be above 5.""" results = [2, 3, 5] sieve = {i: False for i in range(1, limit)} for number in sieve: remainder = number % 60 # Step 3a if remainder in {1, 13, 17, 29, 37, 41, 49, 53}: number_of_integer_solutions = len(find_quadrant1_lattice_points_of_ellipse(number, 4, 1)) for i in range(number_of_integer_solutions): sieve[number] = not sieve[number] # Step 3b if remainder in {7, 19, 31, 43}: number_of_integer_solutions = len(find_quadrant1_lattice_points_of_ellipse(number, 3, 1)) for i in range(number_of_integer_solutions): sieve[number] = not sieve[number] # Step 3c if remainder in {11, 23, 47, 59}: number_of_integer_solutions = len(find_quadrant1_lattice_points_of_hyperbola(number, 3, -1)) for i in range(number_of_integer_solutions): sieve[number] = not sieve[number] # Steps 4-7 for number in sieve: if sieve[number] == False: continue else: results.append(number) square = number**2 max_multiple = math.floor(limit / square) for multiple in range(1, max_multiple + 1): sieve[multiple * square] = False return results
That being done, let’s actually return to the problem. The sum of all primes below 2 million is now just one-liner away:
>>> sum(sieve_of_atkins(2000000)) 142913828922