Note that the performance analysis in the original link does not apply to Julia.
In R the vectorized code is much faster, because loops are slow. In Julia the loop is fast, as fast as the broadcasted (vectorized) version, x .= a .+ b.
In Julia you pay a large price if you allocate a new x at every iteration, which is what happens in the “vectorized” version of the original post.
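The allocation cost mentioned above can be checked with nothing but Base, using the @allocated macro. This is only a minimal sketch: the function names sum_alloc and sum_inplace are hypothetical (they are not from the original post), and the exact byte counts depend on the Julia version.

```julia
# Hypothetical helper: `x + a + b` builds a new Vector on every iteration.
function sum_alloc(a, b, n)
    x = zero(a)
    for _ in 1:n
        x = x + a + b
    end
    return x
end

# Hypothetical helper: `.=` with fused dots writes into the existing `x`.
function sum_inplace(a, b, n)
    x = zero(a)
    for _ in 1:n
        x .= x .+ a .+ b
    end
    return x
end

a = [1.0, 1.0]
b = [2.0, 2.0]

# Call each function once so that compilation is not counted below.
sum_alloc(a, b, 1)
sum_inplace(a, b, 1)

bytes_alloc = @allocated sum_alloc(a, b, 10^4)
bytes_inplace = @allocated sum_inplace(a, b, 10^4)

println("allocating version: ", bytes_alloc, " bytes")
println("in-place version:   ", bytes_inplace, " bytes")
```

The allocating version should report memory roughly proportional to the number of iterations, while the in-place version should only pay for the single x created before the loop.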
Finally, those allocations are characteristic of mutable arrays. If the arrays were immutable, then the original “vectorized” code would be fine, that is:
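using StaticArrays
function vectorized()
    a = SVector(1.0, 1.0)
    b = SVector(2.0, 2.0)
    x = a + b  # define x before the loop so it is still in scope at `return`
    for i in 1:1000000
        x = a + b  # a new SVector each iteration, but no heap allocation
    end
    return x
end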
A new x is created at every iteration of the loop, but not on the “heap”, so it is fast. This can be used for small arrays. See also: Immutable variables · JuliaNotes.jl.
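The same idea can be tried without installing StaticArrays. The sketch below assumes that a plain tuple is an acceptable stand-in for an SVector (an SVector stores its data in a tuple), and the function name accumulate_tuple is hypothetical. It illustrates the point about immutable values; it does not replace the StaticArrays example.

```julia
# Hypothetical Base-only analogue of the SVector example: tuples are immutable,
# so rebinding `x` on each iteration does not need a heap allocation.
function accumulate_tuple(n)
    a = (1.0, 1.0)
    b = (2.0, 2.0)
    x = (0.0, 0.0)
    for _ in 1:n
        x = x .+ a .+ b  # broadcasting over tuples returns a new tuple
    end
    return x
end

accumulate_tuple(1)  # compile first, so compilation is not counted

result = accumulate_tuple(10^6)
bytes = @allocated accumulate_tuple(10^6)

println(result)
println("bytes allocated: ", bytes)
```

If the argument above holds, the reported allocation should be zero or negligible, regardless of the number of iterations.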
PS: I see that the post continues with an analysis of Julia versions. Still, the post is confusing, in my opinion, and there is a general problem in the comparison of the functions: none of them does anything (all of them return “nothing”), so measuring the time of these functions is quite prone to benchmarking artifacts. It is better to benchmark functions that actually return something that can be compared.
This is a better comparison:
julia> function v1()
           a = [1.0, 1.0]
           b = [2.0, 2.0]
           x = [0.0, 0.0]
           for i in 1:10^6
               x = x + a + b
           end
           return x
       end
v1 (generic function with 1 method)
julia> @btime v1()
35.474 ms (1000003 allocations: 76.29 MiB)
2-element Vector{Float64}:
3.0e6
3.0e6
julia> function v2()
           a = [1.0, 1.0]
           b = [2.0, 2.0]
           x = [0.0, 0.0]
           for i in 1:10^6
               x .= x .+ a .+ b
           end
           return x
       end
v2 (generic function with 1 method)
julia> @btime v2()
5.339 ms (3 allocations: 240 bytes)
2-element Vector{Float64}:
3.0e6
3.0e6
julia> function v3()
           a = [1.0, 1.0]
           b = [2.0, 2.0]
           x = [0.0, 0.0]
           for i in 1:10^6
               for index in eachindex(x,a,b)
                   @inbounds x[index] = x[index] + a[index] + b[index]
               end
           end
           return x
       end
v3 (generic function with 1 method)
julia> @btime v3()
3.258 ms (3 allocations: 240 bytes)
2-element Vector{Float64}:
3.0e6
3.0e6
julia> using StaticArrays
julia> function v4()
           a = SVector(1.0, 1.0)
           b = SVector(2.0, 2.0)
           x = SVector(0.0, 0.0)
           for i in 1:10^6
               x = x + a + b
           end
           return x
       end
v4 (generic function with 1 method)
julia> @btime v4()
2.005 ms (0 allocations: 0 bytes)
2-element SVector{2, Float64} with indices SOneTo(2):
3.0e6
3.0e6
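Because these functions return their result, the variants can also be checked against each other before trusting any timing. The following is a rough sketch under stated assumptions: loop_version and broadcast_version are hypothetical names, the iteration count is passed as an argument for convenience, and Base's @elapsed is used only as a coarse timer (for real measurements @btime from BenchmarkTools, as above, is the better tool).

```julia
# Hypothetical parameterized form of the explicit-loop variant (like v3).
function loop_version(n)
    a = [1.0, 1.0]
    b = [2.0, 2.0]
    x = [0.0, 0.0]
    for _ in 1:n
        for i in eachindex(x, a, b)
            @inbounds x[i] = x[i] + a[i] + b[i]
        end
    end
    return x
end

# Hypothetical parameterized form of the in-place broadcast variant (like v2).
function broadcast_version(n)
    a = [1.0, 1.0]
    b = [2.0, 2.0]
    x = [0.0, 0.0]
    for _ in 1:n
        x .= x .+ a .+ b
    end
    return x
end

n = 10^5

# First calls compile the functions and give the results to compare.
r_loop = loop_version(n)
r_bcast = broadcast_version(n)
println("same result: ", r_loop == r_bcast)

# Coarse, single-shot timings; these are noisy and only indicative.
t_loop = @elapsed loop_version(n)
t_bcast = @elapsed broadcast_version(n)
println("loop: ", t_loop, " s, broadcast: ", t_bcast, " s")
```

If the two results differ, the timings are not comparing the same computation and should be discarded.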