Just for the sake of didactics:
The “natural” way to solve that problem is probably:
julia> function meandist(n)
d = 0.
for i in 1:n
x = rand(2)
y = rand(2)
d += sqrt((x[1]-y[1])^2+(x[2]-y[2])^2)
end
return d/n
end
julia> @btime meandist(1000)
88.329 μs (2000 allocations: 187.50 KiB)
0.5351316591605562
This is quite slow and allocates a lot, because of the allocations of x
and y
random vectors at each iteration of the loop. These allocations can be solved by using static allocations, with tuples:
julia> function meandist(n)
d = 0.
for i in 1:n
x = (rand(),rand())
y = (rand(),rand())
d += sqrt((x[1]-y[1])^2+(x[2]-y[2])^2)
end
return d/n
end
meandist (generic function with 1 method)
julia> @btime meandist(1000)
13.012 μs (0 allocations: 0 bytes)
0.5122565299210201
or with the StaticArrays package:
julia> using StaticArrays
julia> function meandist(n)
d = 0.
for i in 1:n
x = SVector{2,Float64}(rand(),rand())
y = SVector{2,Float64}(rand(),rand())
d += sqrt((x[1]-y[1])^2+(x[2]-y[2])^2)
end
return d/n
end
meandist (generic function with 1 method)
julia> @btime meandist(1000)
13.027 μs (0 allocations: 0 bytes)
0.5096925569126717
These options are not as fast as the one that uses the Distances
package because there the computation of the distances is loop-vectorized. We can do that “by hand” (edit: without manually using @avx
the performance is the same - I guess loop vectorization occurs anyway, automatically):
julia> using LoopVectorization
julia> function meandist(n)
d = 0.
x = rand(2,n)
y = rand(2,n)
@avx for i in 1:n
d += sqrt((x[1,i]-y[1,i])^2+(x[2,i]-y[2,i])^2)
end
return d/n
end
meandist (generic function with 1 method)
julia> @btime meandist(1000)
5.006 μs (2 allocations: 31.50 KiB)
0.5336431684976665
and, of course, lot of that time is just the generation of the two matrices:
julia> function meandist(n,x,y)
d = 0.
@avx for i in 1:n
d += sqrt((x[1,i]-y[1,i])^2+(x[2,i]-y[2,i])^2)
end
return d/n
end
meandist (generic function with 2 methods)
julia> x = rand(2,1000); y = rand(2,1000);
julia> @btime meandist(1000,$x,$y)
1.530 μs (0 allocations: 0 bytes)
0.5345441925830112
Edit: Without generating the list of random points inside the function, the option with static arrays is as fast as the loop-vectorized (perhaps it gets loop-vectorized by the compiler automatically):
julia> function meandist(n,x,y)
d = 0.
for i in 1:n
d += sqrt((x[i][1]-y[i][1])^2+(x[i][2]-y[i][2])^2)
end
return d/n
end
meandist (generic function with 2 methods)
julia> x = rand(SVector{2,Float64},1000)
julia> y = rand(SVector{2,Float64},1000)
julia> @btime meandist(1000,$x,$y)
1.508 μs (0 allocations: 0 bytes)
0.5239587680180073
The equivalent version without static arrays is slower, but not bad:
julia> x = [ rand(2) for i in 1:1000 ];
julia> y = [ rand(2) for i in 1:1000 ];
julia> @btime meandist(1000,$x,$y)
3.292 μs (0 allocations: 0 bytes)
0.5311847109064113
Edit 2:
Given that allocating the vectors outside the function gives similar results for static arrays or matrices, the difference is in the time required to generate those. There are important differences there, indeed:
julia> @btime [ SVector{2,Float64}(rand(),rand()) for i in 1:1000 ];
12.237 μs (1 allocation: 15.75 KiB)
julia> @btime rand(SVector{2,Float64},1000);
3.195 μs (1 allocation: 15.75 KiB)
julia> @btime rand(2,1000);
1.605 μs (1 allocation: 15.75 KiB)