Small functions - best practices

Hi all. First-time poster here.

I am writing some scientific code and would like to avoid unnecessary allocations. Specifically, I have a bunch of small functions (essentially one-liners) manipulating Arrays (representing tensors). This is all inner-loop stuff and I want it to be as lean as possible. As a super-simplified example:

linear(x::Vector{Float64}) = x'*W .+ b

My questions are:

  1. From a performance standpoint, does it make sense to define
linear!(x::Vector{Float64}, res::Vector{Float64}) = (res .= x'*W .+ b)

or can I count on the compiler to save me from copy/pasting code by inlining the function or performing some type of return value optimization? Or does it make sense to @inline this type of function myself?

  2. What are useful macros/tools for checking this kind of stuff? @code_llvm? Something from BenchmarkTools.jl? There’s a bunch of stuff out there, but as a relative beginner with no experience reading lower-level code, I don’t know what to look for or where.

Thanks!

If you read what ?@inline says:

Give a hint to the compiler that this function is worth inlining.

Small functions typically do not need the @inline annotation, as the compiler does it automatically.

The second will be slightly faster, but it is probably worth noting that the untyped version, linear!(x, res) = (res .= x'*W .+ b), will have exactly the same performance: the type annotations on the arguments don’t make it any faster. Also, if W and b are non-const global variables, that will absolutely kill performance.
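To make the point about globals concrete, here is a minimal sketch of the two usual ways around it. The names W_CONST, B_CONST, linear_const and linear_args are made up for illustration, and I use W * x rather than the x'*W from the original post, which is an assumption about how the data is laid out.

```julia
# Option 1: declare the globals const so the compiler knows their types.
# (W_CONST and B_CONST are hypothetical names, not from the original post.)
const W_CONST = [1.0 2.0; 3.0 4.0]
const B_CONST = [0.5, -0.5]
linear_const(x) = W_CONST * x .+ B_CONST

# Option 2: pass the parameters as arguments, so no globals are involved at all.
linear_args(W, b, x) = W * x .+ b

x = [1.0, 1.0]
y_const = linear_const(x)
y_args = linear_args(W_CONST, B_CONST, x)
```

Both versions still allocate the result; this only addresses the type-instability cost of non-const globals.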

To perform the multiplication in place and avoid an unnecessary copy, you need to call mul!; broadcasting with .= is unfortunately not enough.
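A minimal sketch of what a mul!-based version could look like. The function name and argument order here are my own choice, W and b are passed in as arguments instead of being read from globals, and I assume the product you want is W * x (for the x'*W from the original post you would pass W' instead of W).

```julia
using LinearAlgebra  # provides mul!

# Hypothetical in-place variant: writes W * x + b into res.
function linear!(res::Vector{Float64}, W::Matrix{Float64},
                 x::Vector{Float64}, b::Vector{Float64})
    mul!(res, W, x)  # matrix-vector product stored directly in res
    res .+= b        # in-place broadcast add
    return res
end

W = [1.0 2.0; 3.0 4.0]
b = [0.5, -0.5]
x = [1.0, 1.0]
res = zeros(2)
linear!(res, W, x, b)
```

Note that res must not alias x here, since mul! reads x while it writes res.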

Didn’t know about mul!, that’s neat, thanks.

The fact that broadcasting is not enough wasn’t obvious to me. And that’s why I added the second question in the original post - what’s the easiest way to catch this kind of stuff?

* is a binary operator that performs the multiplication; it’s a regular function call. Only dotted operations fuse under broadcast, so x'*W is still evaluated into a temporary array before .= copies the result into res. I’m afraid I don’t know any method of catching such allocations other than measuring them explicitly, e.g. with @time.
You can also use Meta.@lower to see what an expression lowers to. I have a feeling it might reveal the temporary array; if not, @code_typed might.
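For the measuring part, here is a rough sketch using @allocated from Base, which returns the number of bytes allocated while evaluating an expression. All function names below are made up for the comparison, and the measurement is wrapped in small helper functions and run once beforehand so that compilation and global-scope overhead are less likely to show up in the numbers.

```julia
using LinearAlgebra  # for mul!

# Broadcast-only version: W * x is computed first and allocates a temporary.
function linear_bcast!(res, W, x, b)
    res .= W * x .+ b
    return res
end

# mul!-based version: the product is written straight into res.
function linear_mul!(res, W, x, b)
    mul!(res, W, x)
    res .+= b
    return res
end

# Hypothetical helpers that measure from inside a function.
bytes_bcast(res, W, x, b) = @allocated linear_bcast!(res, W, x, b)
bytes_mul(res, W, x, b) = @allocated linear_mul!(res, W, x, b)

n = 100
W, x, b, res = rand(n, n), rand(n), rand(n), zeros(n)

# Run once to compile, then measure.
bytes_bcast(res, W, x, b)
bytes_mul(res, W, x, b)
a_bcast = bytes_bcast(res, W, x, b)
a_mul = bytes_mul(res, W, x, b)
println("broadcast only: ", a_bcast, " bytes, mul!: ", a_mul, " bytes")
```

If you have BenchmarkTools.jl installed, @btime linear_mul!($res, $W, $x, $b) reports the allocation count alongside the timing, which is usually more convenient than doing this by hand.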
