Hi! I am new to GPU programming, and I have been experimenting with Metal.jl.
As practice, I wrote this little program to estimate pi using the Monte Carlo method.
using Metal
using Random
throw_dart(a) = (2*a - 1)^2
dart_hit_circle(a) = a <= 1
function est_pi_gpu(N)
    darts = Metal.rand(N, 2)
    darts = mapreduce(throw_dart, +, darts; dims=2)
    hits = mapreduce(dart_hit_circle, +, darts)
    return 4 * hits / N
end
function est_pi_cpu(N)
    darts = Random.rand(N, 2)
    darts = mapreduce(throw_dart, +, darts; dims=2)
    hits = mapreduce(dart_hit_circle, +, darts)
    return 4 * hits / N
end
dart_throw_count = 2^27
println("GPU Attempt:")
@time println(est_pi_gpu(dart_throw_count))
println("\nCPU Attempt:")
@time println(est_pi_cpu(dart_throw_count))
If I set dart_throw_count to anything below 2^27, this works fine. In the first run I set dart_throw_count = 2^27 - 1 and it returns a proper pi estimate. But as soon as I set dart_throw_count = 2^27 or higher (the second run), it just returns 4 for the GPU.
So, only on the GPU, the line hits = mapreduce(dart_hit_circle, +, darts) returns the length of the array when the length is >= 2^27.
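In case it helps anyone reproduce this, here is a minimal sketch of how the raw hit count could be compared against the array length on either backend. The helper names count_hits and compare_sizes are hypothetical (they are not part of Metal.jl or Random), and make_rand is assumed to behave like Random.rand or Metal.rand, i.e. make_rand(N, 2) returns an N×2 array of uniform values in [0, 1). The sketch only runs the CPU path with small sizes; the GPU call is left as a comment because it needs a machine with Metal.jl.

```julia
using Random

throw_dart(a) = (2*a - 1)^2
dart_hit_circle(a) = a <= 1

# Hypothetical helper: same steps as est_pi_gpu / est_pi_cpu above,
# but returns the raw hit count together with N instead of the estimate.
# `make_rand` is an assumption: any function where make_rand(N, 2)
# returns an N×2 array of values in [0, 1).
function count_hits(make_rand, N)
    darts = make_rand(N, 2)
    dist2 = mapreduce(throw_dart, +, darts; dims=2)  # squared distance per dart
    hits = mapreduce(dart_hit_circle, +, dist2)      # darts inside the unit circle
    return hits, N
end

# Hypothetical driver: prints whether every dart was counted as a hit,
# which is the symptom described above (estimate exactly 4).
function compare_sizes(make_rand, sizes)
    for N in sizes
        hits, n = count_hits(make_rand, N)
        println("N = ", n, "  hits = ", hits, "  hits == N: ", hits == n)
    end
end

# CPU reference with small sizes.
compare_sizes(Random.rand, (2^10 - 1, 2^10))

# On a machine with Metal.jl, the sizes from this post could be tried with:
# using Metal
# compare_sizes(Metal.rand, (2^27 - 1, 2^27))
```

If hits == N only shows up for the Metal.rand call at 2^27, that would point at the GPU path (either the random generation or the reduction) rather than at throw_dart or dart_hit_circle.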
Does anyone have any ideas why this is happening?