Iterating over primes

JADekker · April 30, 2025, 8:01am

In Project Euler, prime numbers come up quite a lot, but sometimes it is not clear how many prime numbers are needed. To avoid duplicate calculations, and as a challenge to get a bit more familiar with iterators in julia, I’ve written a struct that computes the next prime number as it is needed and then internally stores the vector of all primes so far, such that these calculations do not need to happen more than once. I’m aware that growing vectors with push! is not very nice, is there another more efficient way (in base julia!) to do what I’m doing? (Purely to help me become more aware of iterator practices, I’m aware of Primes.jl and the likes).

using BenchmarkTools

struct Primes
    primes_vec::Vector{Int}
end
Primes() = Primes([2,3])
function Base.iterate(P::Primes, state=1) 
    if state > length(P.primes_vec) 
        next_prime!(P)
    end
    return (P.primes_vec[state], state+1)
end
function isprime(n, primes::Primes)
    for p in primes
        p > floor(Int, sqrt(n)) && return true
        n % p == 0 && return false
    end
end

""" Assumes that primes_vec contains the first primes (at least 2 and 3) and adds the next prime by a simple sieve. """
function next_prime!(primes::Primes)
    n = primes.primes_vec[end] + 2
    while true
        if isprime(n, primes)
            push!(primes.primes_vec, n)
            return
        end
        n += 2
    end
end

function test_primes()
    P = Primes()
    @time sum(Iterators.take(P, 100_000)) # Here, we need to compute the first 100_000 primes
    @time sum(Iterators.take(P, 100_000)) # Here, we can re-use them from memory. 
end
test_primes()

Gives

  0.062785 seconds (10 allocations: 1.828 MiB)
  0.000066 seconds

DNF · April 30, 2025, 8:14am

The nextprime function works more or less like an iterator: Functions · Primes.jl I think it avoids storing anything.

Even if you don’t want to depend on Primes.jl, you could study it.

JADekker · April 30, 2025, 8:27am

I see, thank you! The way I understood that one is that it would recompute the primes already computed so far (as it doesn’t store anything). My point was to try and avoid unnecessary recomputations. (I’m aware that the computations are cheap here, but let’s imagine I’ll do the same with a different kind of sequence that is computationally expensive to compute)

DNF · April 30, 2025, 8:40am

I don’t know that much about primes, but nextprime calls the isprime function, which does not need to compute all the previous primes.

Prime numbers are probably the most heavily studied subject in all of mathematics, so I expect there are advanced methods for determining primality.

DNF · April 30, 2025, 8:45am

…but if your main goal is to investigate iterators and accumulating stored intermediate results, then, firstly, I would point out that push! is not so inefficient. It preallocates more space than you ask for to avoid resizing on each step. You can also use sizehint! if you have a good hunch how much space you will eventually need (for example if you want to calculate 100_000 primes, you know how much space to set aside.)

Concerning primes again:

This one is dangerous, and can give wrong answers, due to floating point issues. Use isqrt instead. Prime calculations inherently deal with integers, so you should avoid going through float operations.

nsajko · April 30, 2025, 8:45am

This is an interesting kind of iterator, in that it’s a stateful iterator, but only as a performance optimization!

I don’t think you’re doing anything wrong, but here are two slightly different approaches you could have taken:

Make it a purely stateless iterator, by storing the already computed primes in the iteration state (the second return value).
Lock when updating the iterator, to be concurrency-safe.

nsajko · April 30, 2025, 8:50am

AFAIK, once one’s in bigint land, using a sieve (so storing all primes) is much faster than using nextprime repeatedly.

JADekker · April 30, 2025, 8:59am

I see the approach to working purely stateless, but my point is that I wish to re-use the same iterator at a later point and benefit from the work already done

e.g. see how

@time sum(Iterators.take(P, 100_000)) # Here, we need to compute the first 100_000 primes
@time sum(Iterators.take(P, 100_000)) # Here, we can re-use them from memory.

produces the following, where the second run is re-using the work done in the first run

  0.062785 seconds (10 allocations: 1.828 MiB)
  0.000066 seconds

JADekker · April 30, 2025, 9:01am

Thanks, I didn’t think about that!

Sevi · April 30, 2025, 9:44am

If you work with primes a lot and also want save time between Julia sessions, you could also push the computation of primes to the compilation step and keep a “global” list of primes around. The list gets generated once when your package is compiled and could then be used (and expanded) during a session.

I believe this is what base Julia is doing for combinatorial factors up to a certain size:

github.com/JuliaLang/julia

base/combinatorics.jl

1f1839129


      
          const _fact_table64 = let _fact_table64 = Vector{Int64}(undef, 20)
              _fact_table64[1] = 1
              for n in 2:20
                  _fact_table64[n] = _fact_table64[n-1] * n
              end
              Tuple(_fact_table64)
          end
          
          const _fact_table128 = let _fact_table128 = Vector{UInt128}(undef, 34)
              _fact_table128[1] = 1
              for n in 2:34
                  _fact_table128[n] = _fact_table128[n-1] * n
              end
              Tuple(_fact_table128)
          end

sgaure · April 30, 2025, 10:28am

Or faster, p^2 > n, unless overflow is an issue.

You can make a stateless base iterator, and make it stateful via Iterators.Stateful.

Oscar_Smith · April 30, 2025, 2:47pm

As one of the main developers of Primes.jl the basic things you are missing for performance are as follows:

Fast isprime checking (look up Baillie–PSW or Miller-Rabin). For an n bit number, these methods are ~O(n^3) as opposed to O(2^n) of trial division
Sieve methods rather than spending time testing whether each number is prime individually, we can compute all the primes up to n in roughly O(n) operations.

Topic		Replies	Views
Sieve of Eratosthenes iterator New to Julia python , generator , iterators , julia	1	149	October 20, 2024
Fast Generation of Arrays of Prime Numbers Offtopic	32	1325	October 4, 2023
The prevprime and nextprime functions New to Julia question	7	1845	June 10, 2017
Prime number hunting New to Julia	2	758	November 27, 2023
Improving performance in checking prime numbers General Usage	14	1850	March 2, 2021

Iterating over primes

Related topics