Dear all, I’m trying to solve some toy problems in Julia and benchmark them, since there are often many different ways to go. I have a strong Matlab background, so it feels a bit odd to go back to simple loops, and I’d rather avoid them because of their verbosity. But if you tell me that’s the recommended way, I’ll give it a try.
However, here are 9 different approaches to calculating Pi via Monte Carlo sampling (see BasicProblems (ucidatascienceinitiative.github.io)), and I’d be extremely happy if you could point out the best version or propose your own best solution to this problem.
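For reference, this is the estimator that all nine variants are meant to compute, assuming the points $(x_i, y_i)$ are drawn uniformly from the unit square (which is what `rand` produces), so that the fraction landing inside the quarter circle approximates $\pi/4$:

```latex
\hat{\pi} = 4 \cdot \frac{\#\{\, i : x_i^2 + y_i^2 \le 1 \,\}}{N},
\qquad \text{since } P\left(x^2 + y^2 \le 1\right) = \frac{\pi}{4}.
```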
problem4 = (
    function(N) # 1.039 ms (15 allocations: 5.36 MiB)
        r = rand(N,2)
        cnt = count(r[:,1].^2 + r[:,2].^2 .<= 1) # inside the circle
        pi = cnt/N * 4
    end
    ,
    function(N) # 1.167 ms (15 allocations: 5.36 MiB)
        r = rand(2,N)
        cnt = count(r[1,:].^2 + r[2,:].^2 .<= 1)
        pi = cnt/N * 4
    end
    ,
    function(N) # 3.456 ms (7 allocations: 5.36 MiB)
        r = rand(2,N)
        cnt = count(norm.(eachcol(r)) .<= 1)
        pi = cnt/N * 4
    end
    ,
    function(N) # 3.381 ms (7 allocations: 5.36 MiB)
        r = rand(N,2)
        cnt = count(norm.(eachrow(r)) .<= 1)
        pi = cnt/N * 4
    end
    ,
    function(N) # 2.568 ms (9 allocations: 3.83 MiB)
        r = rand(2,N)
        cnt = count(sum(abs2.(r), dims=1) .<= 1)
        pi = cnt/N * 4
    end
    ,
    function(N) # 1.383 ms (13 allocations: 3.83 MiB)
        r = rand(N,2)
        cnt = count(sum(abs2.(r), dims=2) .<= 1)
        pi = cnt/N * 4
    end
    ,
    function(N) # 20.271 ms (200000 allocations: 18.31 MiB)
        cnt = count(sum(rand(2).^2) <= 1 for i in 1:N)
        pi = cnt/N * 4
    end
    ,
    function(N) # 9.725 ms (100000 allocations: 9.16 MiB)
        cnt = count(i->norm(rand(2)) <= 1, 1:N) # no difference to count(norm(rand(2)) <= 1 for i in 1:N )
        pi = cnt/N * 4
    end
    ,
    function(N) # 14.534 ms (100000 allocations: 9.16 MiB)
        cnt = 0
        for i in 1:N
            if norm(rand(2)) <= 1
                cnt += 1
            end
        end
        pi = cnt/N * 4
    end
    ,
)
using Random
using BenchmarkTools
using LinearAlgebra
n = 100000
res = []
for f in problem4
    Random.seed!(0)
    push!(res, f(n))
    @btime $f($n);
end
res
I’m quite unsure whether I should watch the total time or the number of allocations. Since preallocation will clearly run into memory limits for large N, I’m looking for iterative solutions, but all generator-based or explicit-loop-based solutions are slower (in this example).
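A minimal sketch of an iterative variant, assuming the allocations in the last three versions come from the temporary 2-element vector that `rand(2)` creates on every iteration. Drawing two scalars with `rand()` instead should avoid that. The names `estimate_pi_loop` and `estimate_pi_count` are hypothetical and not part of the nine variants above, and no timings are claimed for them:

```julia
using Random

# Hypothetical helper: explicit loop that draws two scalars per iteration
# instead of allocating a vector with rand(2).
function estimate_pi_loop(N; rng = Random.default_rng())
    cnt = 0
    for _ in 1:N
        x = rand(rng)
        y = rand(rng)
        cnt += (x^2 + y^2 <= 1)   # a Bool adds as 0 or 1
    end
    return 4 * cnt / N
end

# Hypothetical helper: the same idea as a one-liner with count(f, itr),
# again without building any temporary arrays.
estimate_pi_count(N) = 4 * count(_ -> rand()^2 + rand()^2 <= 1, 1:N) / N
```

Running `@btime estimate_pi_loop($n)` in the benchmark setup from this post would show whether the allocation count actually drops and how the time compares with the vectorized versions.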
Which one would you choose, based on performance and (imo also very important) readability?
Thanks,
Jan