Large memory allocation in function within function

Here is how I’d implement it:

```julia
# Same speed as the original, but the output type is inferred correctly
function fun1!(output::Matrix, mystruct::Supertype, observations)
    for i in eachindex(mystruct.distr)
        output[:, i] .= logpdf.(mystruct.distr[i], vec(observations))
    end
    return output
end

function fun1(mystruct::Supertype, observations)
    return fun1!(zeros(length(observations), length(mystruct.distr)), mystruct, observations)
end
```

```julia
# 23 ms, compared with 43 ms for the original or 240 ms for fun3
function fun2(mystruct::Supertype, observations)
    log_evidence = fun1(mystruct, observations)
    alpha = zero(log_evidence)  # zero matrix of the same size and element type
    for t in eachindex(log_evidence)
        alpha[t] = log_evidence[t]
    end
    return alpha
end
```
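For concreteness, here is a minimal sketch of how these functions might be called, with `fun1` and `fun2` from above in scope. The concrete type `MyModel` and its single `distr` field are hypothetical stand-ins for whatever your actual `Supertype` subtype looks like:

```julia
using Distributions

abstract type Supertype end

# Hypothetical concrete subtype: one distribution per component
struct MyModel <: Supertype
    distr::Vector{Normal{Float64}}
end

mystruct = MyModel([Normal(0.0, 1.0), Normal(2.0, 0.5)])
observations = randn(1_000)

# log_evidence[j, i] = logpdf of distribution i at observation j
log_evidence = fun1(mystruct, observations)  # 1000×2 Matrix{Float64}
alpha = fun2(mystruct, observations)
```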

A few things to note: it turns out `fun1` is easy to get wrong. I naively assumed that a double for-loop over observations and distributions would be fastest, but it is actually considerably slower than the broadcasted version shown (by a factor of 10!). I’m not quite sure why this is (is Distributions doing something strange?). The version here is the same speed as yours, but the output type is correctly inferred, so there is no speed loss in `fun2`. Note that a common style in Julia is to compose many small functions in this way, which tends to help inference.
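For reference, this is the kind of double loop I mean, a sketch of the version I benchmarked as roughly 10x slower than the broadcasted `fun1!` above (same names and layout as before):

```julia
# Naive scalar double loop: one logpdf call per (observation, distribution) pair
function fun1_loop!(output::Matrix, mystruct::Supertype, observations)
    obs = vec(observations)
    for i in eachindex(mystruct.distr)
        for j in eachindex(obs)
            output[j, i] = logpdf(mystruct.distr[i], obs[j])
        end
    end
    return output
end
```

Semantically this fills the same matrix, so any speed difference must come from how Distributions handles broadcast versus repeated scalar `logpdf` calls.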

Another thing: your use of `1:size(log_evidence)[1]` isn’t very performant; better is `1:size(log_evidence, 1)`. (Again, `@code_warntype` reveals this.)
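To illustrate the two spellings side by side (a trivial sketch, just to make the difference concrete):

```julia
A = zeros(3, 4)

# Less good: build the whole size tuple, then index into it
for t in 1:size(A)[1]
    # ...
end

# Better: ask for the first dimension directly
for t in 1:size(A, 1)
    # ...
end
```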

As you learn the typical style of Julia code, you’ll find yourself automatically writing fairly performant code without many type annotations; a sprinkling of `@benchmark` and `@code_warntype` will help you spot the remaining problems. It does take a bit of time to learn, but the rewards are good.
