I am absolutely sure something like this has been asked before but I cannot for the life of me find it.

I’m currently working on a memory- and speed-efficient way to access the first 1 billion primes as part of a different project. The way I’m using is a 1.4GB Bitarray, which I then use findall to convert back to numbers.

This is acceptably fast (~20 seconds) for my purposes, EXCEPT I need to manually add the number 2 back to the front of the array, which more than quadruples the run time.

Since I know exactly how big the array I need is, is there any way to do something where findall stores its answer directly into a pre-allocated array, so that I don’t need to do 1 billion copy operations?

My current code:

const last_prime = 22801763489
function get_primes()
# Acquire primes as Bitarray
io = open("primes_as_bits.bitarray")
list = BitArray(undef, (Int((last_prime - 1) / 2)))
read!(io, list)
close(io)
# Convert Bitarray into Int[]
final = [2]
push!(final, (findall(x -> x == 1, list) .* 2 .+ 1)...)
return final
end

P.S. I’ve got a threaded version as well, which is a little faster, but I feel like it would still be much more efficient if I could just solve this problem.

First of all, you should use pushfirst! instead of push!. Most of the extra time comes from splatting. Secondly, you can just use a loop to avoid the extra allocations.

Do you mean loop through findnext? I think I tried that and it was still slower, but I’ll get an actual benchmark here in a second. I assumed findall had some hidden optimizations under the hood

Also, thank you. I have like 5 versions I’ve tried and pushfirst was among them. That was the one I was actually citing when I said “4x slower”

function get_primes()
# Acquire primes as Bitarray
io = open("primes_as_bits.bitarray")
list = BitArray(undef, (Int((last_prime - 1) / 2)))
read!(io, list)
close(io)
# Convert Bitarray into Int[]
final = Vector{Int}(undef, n)
final[1] = 2
i = 1;
for j in 2:length(list)
if list[j]
final[i] = list[j]*2 + 1
i += 1
end
end
return final
end

I appreciate it, but I only need this to be less than 30 seconds. This isn’t the project. The next set of calculations is going to take hours. I just needed this to be fast enough to be able to monitor the first step of the next thing without losing my sanity

Interesting! I think you might be able to compute this efficiently from the bitmap directly. Specifically, I believe that this can be viewed as a convolution-like operation over the reshaped bitmap.

The idea is that if you define mask to be an n by k bitarray, then array[i, j] = dot(mask[i, :], mask[j, :]). I definitely don’t have time to work on this, but it is really cool.