# How to remove extra allocation when doing point-wise assignment?

Hello, I found that when the result is assigned to an array, the point-wise loop introduces a lot of additional allocations (on Julia 1.0):

``````
julia> function f1(a, b, c)
           @. t1 = a * b / c
       end
f1 (generic function with 1 method)

julia> function f2(a, b, c)
           for j = 1:size(a, 2), i = 1:size(a, 1)
               t1[i,j] = a[i,j] * b[i,j] / c[i,j]
           end
       end
f2 (generic function with 1 method)

julia> t1, x1, x2, x3 = [rand(10000, 5000) for _ in 1:4];

julia> f1(x1, x2, x3); f2(x1, x2, x3);

julia> @time f1(x1, x2, x3);
  0.095687 seconds (8 allocations: 256 bytes)

julia> @time f2(x1, x2, x3);
  2.002025 seconds (142.34 M allocations: 2.121 GiB, 3.33% gc time)
``````

By comparison, the point-wise loop beats the fused broadcast when there is no assignment:

``````
julia> function g1(a, b, c)
           @. a * b / c
       end
g1 (generic function with 1 method)

julia> function g2(a, b, c)
           for j = 1:size(a, 2), i = 1:size(a, 1)
               a[i,j] * b[i,j] / c[i,j]
           end
       end
g2 (generic function with 1 method)

julia> g1(x1, x2, x3); g2(x1, x2, x3);

julia> @time g1(x1, x2, x3);
  0.299063 seconds (6 allocations: 381.470 MiB, 9.51% gc time)

julia> @time g2(x1, x2, x3);
  0.033901 seconds (4 allocations: 160 bytes)
``````

Is there anything I am doing improperly? I may need the point-wise loop for implementing SharedArray-based parallelism. Thank you!

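You should generally avoid modifying a global variable (your `t1`) within a function. A non-constant global has no fixed type, so each `t1[i,j] = ...` in the loop is dispatched at run time and allocates. Also, it’s better to use BenchmarkTools.jl’s `@btime` to benchmark your functions’ performance.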
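A rough way to see the effect of the global is to compare allocated bytes for the same loop written two ways. This is only a sketch: `buf`, `fill_global!`, `fill_arg!`, and `measure` are hypothetical names, the arrays are deliberately small, and the exact byte counts will depend on the Julia version.

```julia
# Hypothetical names throughout; small arrays keep the run short.
buf = zeros(200, 200)   # non-constant global, playing the role of `t1`

# Writes into the global `buf`; the compiler cannot assume its type here.
function fill_global!(a, b, c)
    for i in eachindex(a)
        buf[i] = a[i] * b[i] / c[i]
    end
    return nothing
end

# Same loop, but the output array is an argument with a concrete type.
function fill_arg!(out, a, b, c)
    for i in eachindex(a)
        out[i] = a[i] * b[i] / c[i]
    end
    return nothing
end

# Measure inside a function so the measuring code itself is type-stable.
function measure()
    a, b, c = rand(200, 200), rand(200, 200), rand(200, 200) .+ 1
    out = similar(a)
    fill_global!(a, b, c)        # warm-up calls, so that compilation
    fill_arg!(out, a, b, c)      # is not counted below
    bytes_global = @allocated fill_global!(a, b, c)
    bytes_arg = @allocated fill_arg!(out, a, b, c)
    return bytes_global, bytes_arg
end

bytes_global, bytes_arg = measure()
println("global: ", bytes_global, " bytes, argument: ", bytes_arg, " bytes")
```

The version that takes the output array as an argument should report far fewer allocated bytes than the one that writes into the global.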


Avoid that global `t1`, as pointed out by @carstenbauer, and the two versions perform essentially the same:

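``````
using BenchmarkTools

# Fused broadcast that writes in place into the preallocated `t1`
function f1(t1, a, b, c)
    @. t1 = a * b / c
end

# Explicit loop over linear indices, also writing into `t1`
function f2(t1, a, b, c)
    for i in eachindex(a)
        t1[i] = a[i] * b[i] / c[i]
    end
end

function ff()
    t1, x1, x2, x3 = [rand(10000, 5000) for _ in 1:4]
    # `$` interpolates the arrays so that only the call itself is timed
    @btime f1($t1, $x1, $x2, $x3)
    @btime f2($t1, $x1, $x2, $x3)
end
``````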

Running it gives:

``````
ff()

  100.161 ms (0 allocations: 0 bytes)
  100.087 ms (0 allocations: 0 bytes)
``````
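If you switch between the broadcast and loop forms, a quick sanity check that both fill the output with the same values can be useful. The sketch below uses hypothetical names (`muldiv_broadcast!`, `muldiv_loop!`) and small arrays; passing all arrays to `eachindex` also makes the loop fail early if their shapes do not match.

```julia
# Hypothetical names; small arrays so the check runs instantly.
function muldiv_broadcast!(out, a, b, c)
    @. out = a * b / c
    return out
end

function muldiv_loop!(out, a, b, c)
    # eachindex with several arrays throws if their indices differ
    for i in eachindex(out, a, b, c)
        out[i] = a[i] * b[i] / c[i]
    end
    return out
end

a, b = rand(50, 40), rand(50, 40)
c = rand(50, 40) .+ 1            # keep denominators away from zero
out_broadcast = muldiv_broadcast!(similar(a), a, b, c)
out_loop = muldiv_loop!(similar(a), a, b, c)
println(out_broadcast ≈ out_loop)
```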