It is common to reduce (e.g. sum) the result of an array expression without needing the resulting array itself. In this case Julia still creates a temporary array to pass to the reduction function, even if temporary arrays were otherwise avoided by taking advantage of loop fusion. Is there a way to skip that temporary for reductions, perhaps with a macro or a specialized function?
If the reduced expression has only one array argument, one can use
mapreduce(). However, since it only accepts one iterator, an expression with several arrays
requires zip(), which degrades performance (it also looks somewhat ugly without tuple destructuring). As an example:
    # Broadcast version: (x .- y) .^ 2 is fused, but still materialized before sum()
    function test_vectorized(x, y)
        sqrt(sum((x .- y) .^ 2))
    end

    # Hand-written loop: no temporary array
    function test_devectorized(x, y)
        s = zero(eltype(x))
        @inbounds @simd for i in 1:length(x)
            s += (x[i] - y[i])^2
        end
        sqrt(s)
    end

    # mapreduce over zipped pairs; p is a tuple (x[i], y[i])
    function test_mapreduce(x, y)
        sqrt(mapreduce(p -> (p[1] - p[2])^2, +, zip(x, y)))
    end

    x = rand(100000000)
    y = rand(100000000)

    # Call each function once before timing so compilation is excluded
    r1 = test_vectorized(x, y)
    @time test_vectorized(x, y)

    r2 = test_devectorized(x, y)
    @time test_devectorized(x, y)

    r3 = test_mapreduce(x, y)
    @time test_mapreduce(x, y)

    @assert isapprox(r1, r2)
    @assert isapprox(r1, r3)
Output (vectorized, devectorized, mapreduce):

    0.575121 seconds (87 allocations: 762.946 MiB, 18.08% gc time)
    0.094775 seconds (5 allocations: 176 bytes)
    0.140150 seconds (9 allocations: 272 bytes)
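For reference, the single-array case mentioned above is the one where mapreduce() works without zip(). Below is a minimal sketch comparing the allocations of the broadcast form and the mapreduce() form. The helper names sum_sq_broadcast and sum_sq_mapreduce and the small input size are illustrative assumptions, not part of the benchmark above.

```julia
# Hypothetical helpers for illustration only.
# Broadcast form: x .^ 2 is materialized as a temporary array, then summed.
sum_sq_broadcast(x) = sum(x .^ 2)

# mapreduce form: abs2 is applied element by element during the reduction.
sum_sq_mapreduce(x) = mapreduce(abs2, +, x)

x = rand(1000)

# Call each function once so compilation is not counted below.
sum_sq_broadcast(x)
sum_sq_mapreduce(x)

# Bytes allocated by each call (exact numbers vary between Julia versions).
a1 = @allocated sum_sq_broadcast(x)
a2 = @allocated sum_sq_mapreduce(x)

println("broadcast: ", a1, " bytes, mapreduce: ", a2, " bytes")
```

I would expect the broadcast form to report at least the size of the temporary array (about 8 KB for 1000 Float64 values) and the mapreduce() form to report far less, but the exact byte counts depend on the Julia version.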
Is there a better way to do this kind of calculation without resorting to devectorization?