Skip to content

Instantly share code, notes, and snippets.

@amacgillivray
Last active January 12, 2025 05:50
Show Gist options
  • Save amacgillivray/c37a2a860df5117561ec64604bcb6b7d to your computer and use it in GitHub Desktop.
Save amacgillivray/c37a2a860df5117561ec64604bcb6b7d to your computer and use it in GitHub Desktop.
Fast Perfect Number Solving with Euclid-Euler and Miller-Rabin

Solving Perfect Numbers Quickly with Haskell

Walking through the stages of optimizing a perfect-number algorithm; starting with a naive approach, rewriting to use the Euclid-Euler theorem, and then leveraging the Miller-Rabin primality test.

A Frustratingly Naive Solution

On a dark and stormy night in Kansas, many years ago, I sat in front of my laptop - face lit only by the screen.

A homework assignment had tasked us with writing a basic haskell function to find every perfect number between 1 and 9000. We weren't studying numbers, nor did we discuss perfect numbers in class. The number "9000" was chosen as the upper limit due to the complexity associated with finding perfects using typical methods.

I had already completed the rest of the assignment - writing other simple haskell functions - and just finished implementing this code:

perfects :: Int -> [Int]
perfects n = [x | x <- [1..n], x == sum (filter (\y -> x `mod` y == 0) [1..x-1])]

main = do print (perfects 9000)

It worked! It generated the correct list. I was done, as far as the assignment went.

Yet, something - boredom, curiosity, a lack of things to do - made me want to keep tinkering. After all, I had been doing a lot of research on quantum computing and HPC, and this seemed like a perfect opportunity to try my hand at optimizing a (comparatively comprehensible) math problem. We'd discussed in class how Haskell was useful for scientific computing. The benefits of the way it handled integers. If all of that were true, why did it take 13.5 seconds to find four measly values?

The code was effectively a literal translation of the set of perfect numbers, $P$, relying on the definition of a perfect number as a number equal to its aliquot sum:

$$ P = \{ \space n \in \mathbb{N} \space | \space n =\sum_{d|n,\space d\neq n} d \space \} $$

Optimization 1: Euclid, Euler, and Mersenne

Like any good aspiring academic, my first step was to see if there were any giants whose shoulders I could stand on. A quick google search introduced me to the Euclid-Euler theorem relating perfect numbers and Mersenne Primes.

The Euclid-Euler Theorem:

$\text{If } \boldsymbol{2^{p+1}-1} \text{ is prime, then } \boldsymbol{2^p(2^{p+1}-1)} \text{ is perfect.}$

Aha! I can work with that. The theorem offers an obvious optimization; Instead of calculating the aliquot sum $\sum_{d|n, d\neq n} d$ for every number between 1 and $n$, we can:

  • find each number of the form $2^{n}-1$
  • check its primality, and
  • if $2^n-1$ is prime, multiply it by $2^{n-1}$ to produce the corresponding perfect number.

This will definitely be faster. However, we've done something that mathematicians might call controversial: multiplying by (powers of) 2. By definition, every perfect number this algorithm finds is going to be even - which might be a problem, since the existence of odd perfect numbers hasn't been disproven, although none have ever been found.

To account for this, we'll define a new set, $P_E$, of the even perfect numbers. Mathematically,

  • $P_E \subset P$ if odd perfect numbers exist,
  • $P_E = P$ if not.

Thus, $P_E \subseteq P$, but we're going to assume $P_E = P$ to optimize. Let's define it:

Where the set of Mersenne numbers, $M$, is

$$ M = \{ \space 2^n - 1 \space | \space n \in \mathbb{N} \space \} $$

And the set of Mersenne Primes, $M_p \subset M$, is

$$ M_P = \{ \space p \in M \space | \space p \text{ is prime }\} $$

The set of Even Perfect Numbers, $P_E \subseteq P$, can thus be written as:

$$ P_E = \{ \space 2^{n-1} \times p \space | \space p = 2^n-1 \in M_P \} $$

The optimized code will generate the list of numbers in $P_E$ up to a specific limit value.

Resulting Optimization In Code

To start, we need to figure out a primality test. For now, the basic approach - checking if dividing $n$ by any number $2 \leq x \leq \sqrt{n}$ leaves a remainder of 0 - will suffice.

The sqrt function in Haskell yields a float value, so we define sqrt' to yield an Int instead:

sqrt' :: Int -> Int
sqrt' = floor . sqrt . fromIntegral

Then, we define the method to check primality:

-- Determines primality; not prime if any modulo (2..sqrt' n) == 0
primality :: Int -> Bool
primality n = if n > 1 then null [ x | x <- [2..sqrt' n], n `mod` x == 0] else False

Now, we want to work with mersenne numbers, which are based on powers of 2. It's helpful to have both the exponent and the actual value (e.g., [8, 3], where 3 is the power and 8 = $2^3$) mapped together in a tuple. The following function generates such pairs:

genpow2 :: Int -> [(Int, Int)]
genpow2 n = [(x, y) | y <- [2..n], x <- [2^y]]

To generate the list of mersenne primes, we check the primality of the first value in the tuple ($2^n$), minus 1:

genprimes :: Int -> [(Int, Int)]
genprimes n = filter (\x -> if fst(x)-1 < 4 then True else primality (fst(x)-1) == True) (genpow2 n)

And finally, applying the Euclid-Euler theorem, we can build the list of even perfect numbers from the list of Mersenne Primes:

perfects :: Int -> [Int]
perfects n = [(fst(x)-1)*(2^(snd(x)-1)) | x <- genprimes n]

Changed Semantics of "Perfects" $n$ Parameter

The algorithm now treats the perfects $n$ argument as the maximum power of two to use when searching for Mersenne primes. The old implementation called perfects 9000, which (in the revised implementation) would check for members of $M_P$ up to $2^{9001}-1$.

Benchmarking runs were performed with the arguments 7 and 32:

  • perfects 7 finds even perfect numbers up to $2^7(2^{8}-1) = 32,640$, or just enough to cover the first 4, providing a nice benchmark against our original algorithm (which also found the first 4).
  • perfects 32 finds every even perfect number below approx. $9.2 \times 10^{18}$ (or 9.2 quintillion), making it just large enough to find the 8th perfect number (approx. $2.3 \times 10^{18}$).
  • No larger values found additional even perfect numbers within a reasonable runtime.

Altered main:

main = print (perfects 32)

Performance

- Original Euclids 7 Euclids 32
Run Command perfects 9000 perfects 7 perfects 32
No. Perfects Found 4 4 8
Largest Found 8128 8128 2305843008139952128
Compute Time 13,500ms 364ms 412ms

Our new approach solves the first 4 perfect numbers ("Euclids 7" run) in just 2.6% of the time taken by our naive method. Beyond the baseline comparison, the algorithm can find the first 8 perfect numbers in almost the same amount of time it takes to solve the first 4 ("Euclids 32")!

Again, all of the savings come from using the Euclid-Euler theorem, which allows us to avoid computing sums (implicitly assuming that $P_E = P$ within our specified range). To give an idea of how powerful the theorem is, Euler himself was able to apply it to calculate the 8th perfect number - shown under "Euclids 32", above - by hand way back in 1732... But, if he could use the theorem find 2305843008139952128 by hand, shouldn't we expect our computers to go a little further?

Optimization 2: The Miller-Rabin Primality Test

In the original/naive implementation, the complexity lay in finding divisors and computing aliquot sums. In the first optimization, we removed such computations altogether, with complexity instead bound by our primality test, where took the modulus of $n$ by every number from $2$ to $\sqrt{n}$. That worked fine for "small" numbers, like 2305843008139952128. But if we wanted to handle truly large numbers with arcane names like "tredecillion" and "centillion", we needed a better way.

With no ideas on how to optimize this on my own, I again turned to google - and quickly found the Miller-Rabin test (or Wikipedia). Searching for it specifically, I found a ready-made haskell implementation of the test on wiki.haskell.org/testing_primality.

About Miller-Rabin

For mathematicians: I risk embarrassing myself if I try to explain the concept's of the algorithm. Instead, go forth and stand on the shoulders of the taller giants. Here is Miller's original paper, here is Rabin's extension, and here is a neat Veritasium video on the mathematics of knots. Shoo!

For my fellow mere mortals: the key trait of the Miller-Rabin test is that it is probabilistic, meaning that there is a chance for incorrect outputs. Being probabilistic allows it to identify primes with high confidence more quickly than known deterministic algorithms.

While it will never generate a false negatives (identifying a prime number as a composite), the Miller-Rabin test can produce false positives (identifying a composite number as a prime). Thus, when we use it in our algorithm, we know that it will correctly identify every mersenne prime (and thus find every even perfect number), however, there is a small chance that it classifies a non-prime Mersenne number incorrectly, which would result in a "fake" perfect number in our output.

To prevent errors, the test is run one or more times using "witness" numbers. A good explanation of how/why witnesses suggest primality, and how to choose them, is provided on the Wikipedia page for Fermat's little theorem. In my code, I run the algorithm with an arbitrarily chosen non-random list of witnesses, pulled from sets shown to be sufficient up to certain thresholds. Using that witness set, no false positives were encountered when checking primality of all Mersenne numbers up to $2^{2281}-1$.

Incorporating in Code

Putting Miller-Rabin to use is easier than understanding it.

To incorporate it into our code, we:

  • copy and paste the wiki.haskell.org implementation above our existing code
  • Remove our sqrt' and primality functions
  • Rewrite genprimes to check primality using Miller Rabin and our list of witnesses

The code is shown below:

-- https://wiki.haskell.org/Testing_primality#Miller-Rabin_Primality_Test
-- Place wiki.haskell.org code here: 
-- millerRabinPrimality :: Integer -> Integer -> Bool

-- Our code: 
genpow2 :: Integer -> [(Integer, Integer)]
genpow2 n = [(x, y) | y <- [2..n], x <- [2^y]]

genprimes :: Integer -> [(Integer, Integer)]
genprimes n = filter 
    (\x -> 
        if fst(x)-1 < 4 
        then True 
        else all (== True) (map (millerRabinPrimality (fst(x)-1)) (filter (<= snd(x)) [2,3,5,7,11,13,17,31,73]))
    ) (genpow2 n)

perfects :: Integer -> [Integer]
perfects n = [(fst(x)-1)*(2^(snd(x)-1)) | x <- genprimes n]

-- Very large values can be solved quickly
--    platform: mobile Intel i7 (8th gen); 16GB DDR4
-- <= 256  solves in ~0.5s 
-- <= 512  solves in ~5.0s 
-- <= 1024 solves in ~50s
-- <= 2048 solves in ~500s
-- 
--   ...isn't that a beautiful pattern? 
--   increasing n by 2x increases runtime by ~10x
-- 
-- To find n perfect numbers, use p from the nth perfect number on this list: 
-- https://www.vaxasoftware.com/doc_eduen/mat/numperfe_eng.pdf
main = print (perfects 256)

...Holy $%@!

- Original Euclids Miller-Rabin (256) Miller-Rabin (1024) Miller Rabin (1279) Miller Rabin (2281)
No. Perfects Found 4 8 12 14 15 17
Largest (Digits) 4 19 77 366 770 1373
Compute Time (ms) 13,500 412 534 47,684 107,975 890,929

Take that, Euler!

We've successfully walked from our naive algorithm, where solving for large perfect numbers was intractable, to a smarter implementation that can quickly and accurately find perfect numbers with hundreds of digits. We've now thoroughly beaten Euler's by-hand calculations, and gone far beyond the scope of our original homework assignment.

And we didn't even have to worry about integer overflows! Running the Miller-Rabin implementation with perfects 2281 yields all 1373 digits of the 17th perfect number (first found in 1952). That's ~68x longer than the maximum size (in digits) of a C++ unsigned long long, and we never hit an overflow or imported a bigmath library. Praise $\lambda$!

So, what did we learn?

Well, first and foremost, we should note that our approach here isn't exactly unique. We've glued two well-known and highly related concepts together.

However, in doing so, we've taken a good look at algorithmic optimization techniques. The relative complexity of our first two algorithms is visible in the set definitions we used as guides, where we eliminated a sum operation:

$$ P = \{\space n \in \mathbb{N} \space | \space n =\sum_{d|n,\space d\neq n} d \space \} $$ $$ P_E = \{ \space 2^{n-1} \times p \space | \space p = 2^n-1 \in M_P \} $$

While the Miller-Rabin test is more difficult to understand, the reduction in operations compared to our naive primality test is clear, demonstrated by the subsequent speedup of calculations. The test also introduces us to important concepts like probabilistic algorithms and Fermat's little theorem (aka the driving force behind RSA public-key encryption). Both of these concepts, as well as Miller's original primality test from 1976, have a direct relation to Shor's algorithm for factoring large numbers using quantum computers. The fact that these concepts protect the world's data makes all of this feel startlingly useful, despite originally appearing (to students like myself) as abstract number theory.

And that's really the gist: walking through these techniques helps advanced concepts and techniques feel within reach. It's a very pure and understandable toy problem.

A Final Exercise for the Reader

If you read up on the list of known Perfect Numbers and their corresponding Mersenne Primes, you'll see that most use the Lucas-Lehmer Test (LLT), a specialized deterministic primality test for Mersenne numbers. The test would likely provide yet another speedup compared to the miller-rabin solution provided here -- can you implement it?

{-|
Original / Naive Approach
-}
perfects :: Int -> [Int]
perfects n = [x | x <- [1..n], x == sum (filter (\y -> x `mod` y == 0) [1..x-1])]
main = do print (perfects 9000)
{-|
Euclid-Based Solution (Naive Primality Testing)
-}
sqrt' :: Int -> Int
sqrt' = floor . sqrt . fromIntegral
primality :: Int -> Bool
primality n = if n > 1 then null [ x | x <- [2..sqrt' n], n `mod` x == 0] else False
genpow2 :: Int -> [(Int, Int)]
genpow2 n = [(x, y) | y <- [2..n], x <- [2^y]]
genprimes :: Int -> [(Int, Int)]
genprimes n = filter (\x -> if fst(x)-1 < 4 then True else primality (fst(x)-1) == True) (genpow2 n)
perfects :: Int -> [Int]
perfects n = [(fst(x)-1)*(2^(snd(x)-1)) | x <- genprimes n]
main = print (perfects 32)
{-|
Euclid-Based Solution with Miller-Rabin Primality Testing
--
I am not the author of the miller-rabin implementation below,
it comes from https://wiki.haskell.org/Testing_primality
-}
-- (eq. to) find2km (2^k * n) = (k,n)
find2km :: Integral a => a -> (a,a)
find2km n = f 0 n
where
f k m
| r == 1 = (k,m)
| otherwise = f (k+1) q
where (q,r) = quotRem m 2
-- n is the number to test; a is the (presumably randomly chosen) witness
millerRabinPrimality :: Integer -> Integer -> Bool
millerRabinPrimality n a
| a <= 1 || a >= n-1 =
error $ "millerRabinPrimality: a out of range ("
++ show a ++ " for "++ show n ++ ")"
| n < 2 = False
| even n = False
| b0 == 1 || b0 == n' = True
| otherwise = iter (tail b)
where
n' = n-1
(k,m) = find2km n'
b0 = powMod n a m
b = take (fromIntegral k) $ iterate (squareMod n) b0
iter [] = False
iter (x:xs)
| x == 1 = False
| x == n' = True
| otherwise = iter xs
-- (eq. to) pow' (*) (^2) n k = n^k
pow' :: (Num a, Integral b) => (a->a->a) -> (a->a) -> a -> b -> a
pow' _ _ _ 0 = 1
pow' mul sq x' n' = f x' n' 1
where
f x n y
| n == 1 = x `mul` y
| r == 0 = f x2 q y
| otherwise = f x2 q (x `mul` y)
where
(q,r) = quotRem n 2
x2 = sq x
mulMod :: Integral a => a -> a -> a -> a
mulMod a b c = (b * c) `mod` a
squareMod :: Integral a => a -> a -> a
squareMod a b = (b * b) `rem` a
-- (eq. to) powMod m n k = n^k `mod` m
powMod :: Integral a => a -> a -> a -> a
powMod m = pow' (mulMod m) (squareMod m)
{-| START MY CODE -}
genpow2 :: Integer -> [(Integer, Integer)]
genpow2 n = [(x, y) | y <- [2..n], x <- [2^y]]
genprimes :: Integer -> [(Integer, Integer)]
genprimes n = filter
(\x ->
if fst(x)-1 < 4
then True
else all (== True) (map (millerRabinPrimality (fst(x)-1)) (filter (<= snd(x)) [2,3,5,7,11,13,17,31,73]))
) (genpow2 n)
perfects :: Integer -> [Integer]
perfects n = [(fst(x)-1)*(2^(snd(x)-1)) | x <- genprimes n]
main = print (perfects 256)

Comments are disabled for this gist.