Generators vs loops vs broadcasting: Calculate PI via Monte Carlo Sampling

For small N, you can encode the size in a type so that the compiler knows it at compile time. The function then specializes on that size and can be very fast and non-allocating:

# original
julia> function f(N)
           r = rand(N,2)
           cnt = count(r[:,1].^2 + r[:,2].^2 .<= 1)
           pi = cnt/N * 4
       end
f (generic function with 2 methods)

julia> @btime f(10)
  285.348 ns (8 allocations: 1.14 KiB)
4.0

# now the trick
julia> using StaticArrays

julia> struct MyInteger{N}
         i::Int
       end

julia> function f(n::MyInteger{N}) where N
         r = rand(SMatrix{N,2,Float64})
         cnt = count(r[:,1].^2 + r[:,2].^2 .<= 1)
         pi = cnt/N * 4
       end
f (generic function with 2 methods)

julia> n = MyInteger{10}(10)
MyInteger{10}(10)

julia> @btime f($n)
  43.885 ns (0 allocations: 0 bytes)
2.4
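
The same idea can be sketched without a custom wrapper type or StaticArrays, since Base already provides `Val` for carrying a value as a type parameter. This is only an illustration under that assumption, not the code benchmarked above: the name `estimate_pi` is hypothetical, and a tuple of `Bool`s stands in for the `SMatrix`.

```julia
# Sketch: Val{N} from Base carries N as a type parameter,
# so the method is compiled separately for each N.
function estimate_pi(::Val{N}) where {N}
    # ntuple(f, Val(N)) builds a fixed-size tuple; each entry records
    # whether one random point in the unit square lies inside the quarter circle.
    hits = ntuple(_ -> rand()^2 + rand()^2 <= 1, Val(N))
    return 4 * count(hits) / N
end

estimate_pi(Val(10))
```

As with `MyInteger{N}`, this only pays off for small N: every distinct N triggers a new compilation, and very large tuples are slow to compile.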

For large N, the advantage of writing an explicit loop is that it can be parallelized, for example with FLoops:

julia> using FLoops

julia> function f(N)
           @floop for i in 1:N
               if rand()^2 + rand()^2 <= 1
                   @reduce(cnt += 1)
               end
           end
           pi = cnt/N * 4
       end
f (generic function with 1 method)

julia> @btime f(100000)
  317.370 μs (111 allocations: 5.75 KiB)
3.13936
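
For comparison, here is a minimal sketch of the same chunked reduction using only Base threading. It assumes Julia 1.3 or later started with more than one thread (for example `julia -t 4`); the names `count_hits` and `estimate_pi_threaded` are hypothetical, and this is not meant to describe how FLoops works internally.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper: serially count how many of n random points
# in the unit square fall inside the quarter circle.
count_hits(n) = count(rand()^2 + rand()^2 <= 1 for _ in 1:n)

function estimate_pi_threaded(N; ntasks = nthreads())
    # Split N into ntasks nearly equal chunks; the first `extra` chunks get one more sample.
    base, extra = divrem(N, ntasks)
    tasks = [@spawn count_hits(base + (k <= extra)) for k in 1:ntasks]
    # Each task returns its own count, so no shared counter is mutated.
    return 4 * sum(fetch, tasks) / N
end

estimate_pi_threaded(100_000)
```

Having each task return its own partial count and summing at the end avoids a data race on a shared `cnt`, which is the same job `@reduce` does in the FLoops version.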