I know nothing of computational theory, if not that sometimes you come across to some “big O” notation to rank algorithm efficiency.
How do I compute this “big O” notation for the two implementations of prime number generators above? I have lists that are not dense and break conditions, so how do I account for these factors in computing the big O ?
I was thinking that the first one was more efficient but it ends up they are similar…
function primes(x)
list = Int64[]
for i in 2:x
prime = true
for j in list
if i%j == 0
prime = false
break
end
end
if prime push!(list,i); end
end
return list
end
function primesRec(x)
if x == 1 return Int64[]; end
primes = primesRec(x-1)
prime = true
for i in primes
if x%i == 0
prime = false
break
end
end
if prime push!(primes,x); end
return primes
end
The very basic idea is that small problems are usually fast to solve anyway, so the interesting part is how the complexity grows as the problem grows. If we as here have that the complexity grows with x, the upper bound for all primes we want to find, then we want to know which term in the complexity expression will dominate.
Here the most expensive things are probably allocations (can happen in push! in outer loop, so no more than x-1 times), and modulus (happens in inner loop, so \sum_i^x\pi(i) where \pi(i) is the prime counting function.
Both algorithms do the same amount of pushes and it will be less than x reallocations (though larger reallocations might be more expensive, but lets ignore that for now).
Both algorithms do \sum_i^x\pi(i) number of modulus operations. Since \pi(i)<i this expression can be upper bounded by \sum_i^x(i-1)=x*(x-1)/2 where the x^2 term will be the dominant one as x grows. We could also be smarter here and use the prime number theorem to say that \sum_i^x\pi(i)\approx\sum_i^xi/\log(i) but then it becomes harder to simplify that further (or I am too lazy to at least).
So I would say both algorithms are O(x^2), and you probably can put a better bound on them with a little more work. Though that does not mean they should run with the same speed, just that the asymptotic growth will be similar.
Some simple things to make them more efficient could be to not loop through all the prime list every time, you only need to loop up to where the prime squared is larger than the number you check. You could also use sizehint! to tell julia what the expected size of the list is so there would be less reallocations, number of primes below x are approximately x/\log(x) so you could use something like that as the initial hint. My guess is also that since you want the full list of primes, a sieve of eratosthenes would be more efficient in speed, even though the big O complexity might be similar, since you won’t do as much modulus which is a costly operation.