Fast Generation of Arrays of Prime Numbers

BTW: I switched the topic name - I hadn’t really meant to make this about Mojo even if I gave it top billing.

1 Like

Rather than having multiple threads sieving in different ranges, you can just have the prime sieve itself be parallel (either by sweeping the same range with different primes or sweeping separate ranges). The difference is the prime sieve will (theoretically) know the optimal way of distributing the work for the internals of the sieve. For example, rather than using regions of size 10^8, you probably want to use regions that are tuned to fit the CPU caches (taking into account the compression that the sieve is using)

1 Like

Yes, I think that makes sense to me.
Not to presume, but are you telling me this is an improvement you are willing to incorporate soon? If so, I can’t wait to start modifying my code!

I plan on merging faster prime sieve part 1 by oscardssmith · Pull Request #120 · JuliaMath/Primes.jl · GitHub shortly which will is the start of improving this (and by itself is a ~13% speed increase). The next step will probably be to add the iterator which will save you roughly 1200 hours of compute time since you won’t have to repeatedly sieve the small primes. I probably won’t have time for most of the other improvements in the short term (this is a project I maintain for fun when I have the time to), but I would be happy to help you (or anyone else) that wants to implement further improvements.

5 Likes

That is wonderful news! I’ll take 1200 hours. And I hope that I might one day have some ability to implement improvements for the community, I’m not there yet (or I don’t think so, maybe its not as hard as it might seem on the outside).

One relatively easy way for you to help even if you’re fairly new is to review the code that I’m writing as an extra set of eyes can be very useful in ensuring bugs don’t slip in. Other than that, some of the needed improvements are pretty simple and I would be glad to help you with them if you are interested. The techniques we should be using are well documented, so as long as you have some number theory and a bit of Julia experience you should be set.

1 Like

I’d be pleased to attempt to help! I’m out of town this week, but next week I’ll be around and working.
BTW: I appreciate what you and others do to make these packages. I think my earlier posts don’t properly reflect that appreciation.

5 Likes

Just out of curiousity, what kind of times are you getting for primes up to 10^8, as well as for the subsequent 10^8->2(10^8), etc? On my computer just running the regular primes.jl function, i get:

@time primes(10^8); 0.778268 seconds (5 allocations: 69.530 MiB, 2.08% gc time)
@time primes(10^9); 9.403250 seconds (5 allocations: 643.503 MiB, 8.93% gc time)

I’ve researched/experimented with prime number generation for a while now, and i’ve written my own functions to sieve primes that is much faster and using less memory. (i realize that other sievers exist that are also quicker than the primes.jl function)
Here’s my best stats so far. (All primes >= 5)
10^ 8 0.321051 seconds (1.23 k allocations: 49.992 MiB, 2.49% gc time)
10^ 9 5.098315 seconds (3.40 k allocations: 447.748 MiB, 0.31% gc time)
And im still discovering new ways to shorten the durations, while using less memory, but it’s a process.

I get similar results with Primes.jl for 10^9. My difficulty rests in seeking values 10^7 * 10^9, which requires iteration over ranges up to target of 10^16. My friends here suggest having a generator offering one prime at a time will help dramatically, and I’m inclined to agree it will. (As an aside, for now, nothing has been happening with my compute since I upgraded to Julia 1.9. I’m having an executable path error that I’m having trouble fixing.)
Thanks for your interest -
Tad

I don’t think sieving is necessarily the best approach for this kind of task, especially if you want to parallelize the computations. Instead I would suggest to look into using a probabilistic primality test like Miller-Rabin or BPSW. They are very easy to implement, and e.g. for primes less than 2^{64} they can easily be made deterministic, too (i.e. won’t accept any non-primes). Of course doing this is slower per prime than a sieve, but only if one ignores things like storage requirements and parallelization. Indeed, unlike a sieve, it is trivial to use this across many threads, cores, CPUs or even machines.

Apparently version 0.5.4 of Primes.jl even implements Miller-Rabin, BPSW and more. Unfortunately its documentation for isprime does not make any promises about whether it is accurate in certain ranges. But looking at its code, I think that the test is deterministic for at least all n<2^{64}.

2 Likes

Interesting point! That coupled with the earlier suggestion of a prime by prime generator should offer my process a substantial speed up. Thank you for your interest in my project.

Using isprime will likely be a lot slower when you need to generate a large number of primes.

That depends. If you want to generate a lot of 4000 bit primes, a Sieve will be useless, and using isprime will be infinitely faster (and is essentially the state of the art for generating such primes).

If you want to search all primes in a certain range, and that range is not too big, you should use a prime sieve.

If you want to search in a range that is beyond sieving on “regular” machines (as seems to be the case), well, either you are out of luck, or for a “moderate” range (as, again, seems to be the case here), then using isprime still may be viable, because at least your machine can do something besides swapping data between RAM and disk drive. Plus, it can be parallelized across cores and machines, so is ideal for cluster computations.

4 Likes