# Memory allocation and usage of dot notation

The Performance Tips article advises paying attention to memory allocation when evaluating or improving the efficiency of a program. This made me wonder whether the estimated number of allocations or the estimated memory consumption is the more meaningful quantity in this respect (or whether this is a meaningless question and the two should always be considered jointly).

In trying to find out, I produced the following result.

``````
using BenchmarkTools

function fun(x)
    return @. 2*x^5 + 3*x^2 + sqrt(x) + 10.0 + x^1/5
end

x = rand(Float64, 100_000)

@btime fun.(x);
# 723.400 μs (5 allocations: 781.36 KiB)
@btime fun(x);
# 755.700 μs (2 allocations: 781.30 KiB)
``````

If I have used `@btime` correctly, this seems to indicate that the second call, which has the smaller estimated number of allocations, is slower than the first, which I find rather counter-intuitive. This would imply that in some cases the number of allocations is of no concern when evaluating the speed of a function. Does anyone have a good explanation of why this is the case in this particular example, or general advice on which quantities I should focus on when benchmarking with `@btime`?

`x` is a non-`const` global variable here. That can often influence benchmarking, though it is somewhat unpredictable when it will happen. Try interpolating the variable with `$`:

``````
julia> @btime fun($x);
  641.000 μs (2 allocations: 781.30 KiB)

julia> @btime fun.($x);
  518.600 μs (2 allocations: 781.30 KiB)
``````

The difference in allocations disappeared, but there is still a clear performance difference; I don't know why.
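
To illustrate why a non-`const` global can matter, here is a minimal sketch using only Base. The names `data`, `sum_global` and `sum_arg` are made up for this example and are not from the original post. It asks the compiler which return type it can infer when a function reads a global versus when it receives the same array as an argument; I would expect `Any` in the first case and `Float64` in the second.

```julia
# Hypothetical names (data, sum_global, sum_arg), chosen for illustration only.
data = rand(10)              # non-const global: its type could change at any time

sum_global() = sum(data)     # reads the global directly
sum_arg(v) = sum(v)          # receives the array as an argument

# Ask the compiler which return type it can infer for each method.
inferred_global = Base.return_types(sum_global, ())
inferred_arg = Base.return_types(sum_arg, (Vector{Float64},))

println("reading the global:  ", inferred_global)
println("taking an argument:  ", inferred_arg)
```

Interpolating with `$` in `@btime` has a similar effect to passing the value as an argument: the benchmarked expression sees a value of known type rather than an untyped global.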

Thanks for the advice! Good to at least see the difference in allocations disappear.

The ideal is 0 allocations, in which case the distinction doesn't matter (or some low fixed number: when you see e.g. 2, it may be an artifact of benchmarking in the REPL and may go to 0 in real code).

Sometimes type instability, and thus allocations, is OK, and not all code is speed-critical. I would look at the timings first to see whether I need to worry (for a realistic-sized workload). If you want the fastest code, it's often not easy to know the performance ceiling, but any remaining allocations are a good indicator that you haven't reached it.

Note that `x^1/5` parses as `(x^1)/5`. I'm guessing that you intended `x^(1/5)`.
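
A quick way to check this yourself, as a small sketch with an arbitrarily chosen value `x = 32.0` (not from the original post):

```julia
x = 32.0

as_written = x^1/5       # ^ binds tighter than /, so this is (x^1)/5
intended   = x^(1/5)     # fifth root of x

println(as_written)      # 6.4
println(intended)        # close to 2.0

# The parser agrees: both strings produce the same expression tree.
println(Meta.parse("x^1/5") == Meta.parse("(x^1)/5"))
```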

…apart from interpolating (already pointed out)…

``````
julia> @btime fun($x)
  690.700 μs (2 allocations: 781.30 KiB)
100000-element Vector{Float64}:
``````

…I would also always consider whether you really still need the input (the original vector `x`, in this case), or whether you “only” need the results to continue with whatever you're doing after calling that function. If only the result is needed, I'd reuse the already existing memory, like so (disclaimer: I'm new to Julia, so maybe there's a more elegant way):

``````
function fun2!(x::Vector{Float64})
    @. x .= 2*x^5 + 3*x^2 + sqrt(x) + 10.0 + x^1/5
end
``````

``````
julia> @btime fun2!($x)
  552.700 μs (0 allocations: 0 bytes)
100000-element Vector{Float64}:
``````

…it may only be a 20% performance improvement in this case, but it can be more dramatic in other cases, when you're operating on larger data, I think.

And while the number of allocations might hint at where memory is being allocated (i.e. how often), I would pay more attention to the amount. In this case, the original version allocated 781.3 KiB = 781.3 × 1024 bytes, which is almost exactly 800,000 bytes. That tells you it allocated essentially just the memory needed to store the result of the function for 100,000 `Float64` values (8 bytes each). Why it says “2 allocations” and not just one, I have no idea, though.
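
The arithmetic can be checked in a few lines. This is only a sketch; the variable names are mine, and the small remainder is computed from the rounded figure that `@btime` printed, so it is approximate.

```julia
n = 100_000

payload_bytes = n * sizeof(Float64)     # 8 bytes per element
payload_kib = payload_bytes / 1024      # same amount expressed in KiB

println(payload_bytes)                  # 800000
println(payload_kib)                    # 781.25

# @btime reported 781.30 KiB (rounded to two decimals); the difference to the
# pure payload is tiny compared to the data itself.
reported_bytes = 781.30 * 1024
println(reported_bytes - payload_bytes)

# sizeof on an actual array gives the same payload size.
println(sizeof(zeros(n)) == payload_bytes)
```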

Just for the record, the usual idiomatic way to do this in Julia is to write a scalar function and broadcast it, in place or not:

``````
julia> f(x) = 2*x^5 + 3*x^2 + sqrt(x) + 10.0 + x^(1/5)
f (generic function with 1 method)

julia> x = rand(10^5);

julia> @btime $x .= f.($x);
  588.600 μs (0 allocations: 0 bytes)

julia> @btime y = f.($x);
  612.474 μs (2 allocations: 781.30 KiB)
``````
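
If BenchmarkTools is not at hand, the allocation difference between the two forms can also be sketched with Base's `@allocated`. The helper names below (`apply_inplace!`, `apply_copy`, `bytes_inplace`, `bytes_copy`) are hypothetical and only serve this example. The measurement is wrapped in functions so that the non-`const` global issue mentioned earlier in the thread does not distort it.

```julia
f(x) = 2*x^5 + 3*x^2 + sqrt(x) + 10.0 + x^(1/5)

# Hypothetical helper names, introduced only for this sketch.
apply_inplace!(v) = (v .= f.(v); v)   # overwrite v with f applied elementwise
apply_copy(v) = f.(v)                 # allocate a new result array

# Measure inside functions so the argument type is known to the compiler.
bytes_inplace(v) = @allocated apply_inplace!(v)
bytes_copy(v) = @allocated apply_copy(v)

v = rand(100_000)

# Warm-up calls, so that compilation is not part of the measurement.
bytes_inplace(copy(v))
bytes_copy(v)

println("in place:     ", bytes_inplace(copy(v)), " bytes")
println("out of place: ", bytes_copy(v), " bytes")
```

I would expect the out-of-place version to allocate at least the size of the result array and the in-place version to allocate far less, in line with the `@btime` output above.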