That is a benchmarking artifact: you are also timing the creation of the input arrays. If you interpolate them with $, only one allocation remains (the result vector):
julia> @btime foo($([1.0, 1.0, 1.0]), $([2.0, 2.0, 2.0]))
32.036 ns (1 allocation: 112 bytes)
3-element Vector{Float64}:
-3.0
-3.0
-3.0
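To see the same effect without BenchmarkTools, here is a minimal sketch using Base's @allocated. This is a stand-in for illustration, not what @btime does internally. The helper name with_construction is hypothetical, and foo is redefined here with the same body as in this thread so the snippet runs on its own. Exact byte counts depend on the Julia version, so none are claimed.

```julia
# Same body as foo in this thread, repeated so the snippet is self-contained.
foo(x::AbstractVector{Float64}, y::AbstractVector{Float64}) = (x .- y) .* 3.0

# Hypothetical helper: builds both inputs inside the measured call,
# which is what happens when the literals are not interpolated.
with_construction() = foo([1.0, 1.0, 1.0], [2.0, 2.0, 2.0])

# Inputs created once, outside the measurement.
x = [1.0, 1.0, 1.0]
y = [2.0, 2.0, 2.0]

# Warm up both calls so compilation is not counted.
with_construction()
foo(x, y)

bytes_with_construction = @allocated with_construction()
bytes_call_only = @allocated foo(x, y)

println("inputs built inside the call: ", bytes_with_construction, " bytes")
println("inputs built beforehand:      ", bytes_call_only, " bytes")
```

The second number should be the smaller one, since only the result vector is allocated when the inputs already exist.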
If you use static arrays, you can go to zero allocations:
julia> using StaticArrays
julia> function foo(x::AbstractVector{Float64}, y::AbstractVector{Float64})
           (x .- y) .* 3.0
       end
foo (generic function with 2 methods)
julia> @btime foo($(SVector{3,Float64}(1.0, 1.0, 1.0)), $(SVector{3,Float64}(2.0, 2.0, 2.0)))
0.015 ns (0 allocations: 0 bytes)
3-element SVector{3, Float64} with indices SOneTo(3):
-3.0
-3.0
-3.0
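This can be very useful if your problem deals with small, fixed-size arrays, as in the example. Note that the 0.015 ns timing itself is not meaningful: a sub-nanosecond result means the compiler constant-folded the call on the interpolated values. The zero-allocation result still holds. To get a realistic time, wrap the inputs in Ref, e.g. foo($(Ref(x))[], $(Ref(y))[]) with x and y being the two SVectors.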