# Reducing allocations (again)

I still don’t have great intuition for quickly spotting my bottlenecks due to unneeded allocations and the associated work around. Here is some code I’m trying to optimize right now:

``````function logpost(w,data,μ,Σ)
post = 0.0
#    rightWrong = data[:,1]
#    total = data[:,2]
#    xvec = [1.0,0.0]
for (idx,d) in enumerate(data[:,2])
#xvec = d  # = [d^x for x in [0,1] ]
#display(xvec)
#μ = log(1 + exp(-w' * xvec))  # This is playing the role of the likelihood.
if data[idx,1] < 1e-2
@timeit "calc post1" post -= log(exp(w' * [1.0,d]) + 1)
else
@timeit "calc post2" post += w' * [1.0,d] - log(exp(w' * [1.0,d]) + 1)
end
@timeit "calc post3" post += logPriorGaussian(w,μ,Σ)
end
return post
end

function getDraws(post::Function, data::Array{Float64},ndraws::Int64,candsig2::Array{Float64},μ::Array{Float64}, Σ )::Array{Float64}

w = Array{Float64}(undef,(ndraws,2))
w[1,:] = Float64[-8,10]
i = 2
draw = Float64
tempw = Float64[-8,10]
accepted = [0,0]
while i <= ndraws
for j in 1:2  # Loop over number of parameters that we are estimating.
w[i,j] = w[i-1,j]  # Set the next draw to be the previous one.  It will stay this way unless we decide to accept a new draw.

cand = Normal(w[i-1,j],sqrt(candsig2[j]))  # Center the envelope distribution on the current value of w
@timeit "Draw normal" draw = rand(cand,1)
@timeit "reassign" tempw[j] = w[i-1,j]
@timeit "denom" denom = post(tempw,data,μ,Σ)
tempw[j] = draw
@timeit "numerator" numerator = post(tempw,data,μ,Σ)
tempw[j] = w[i-1,j]
@timeit "calc" r = numerator - denom
unif = Uniform(0,1)
unifdraw = log.(rand(unif,1))
if r >= 1.0
w[i,j] = draw
tempw[j] = draw
accepted[j] += 1
elseif r < 1.0 && unifdraw < r
w[i,j] = draw
tempw[j] = draw
accepted[j] += 1
end
end
i += 1

end
println(accepted, "  ", ndraws)
println("Acceptance percentage:", accepted/ndraws, " ",accepted/ndraws)
return w
end

``````

(There are some functions that are called here that I haven’t included, but could if you want to see them.)

Thanks in advance for any insights

You’ll notice that I am repeatedly defining and drawing from statistical distributions and the `rand` function returns an array instead of a number. Can I make `rand` return a scalar instead of a vector? Would that help?

This copies the column of `data`, which will cause lots of unnecessary allocations. You want `@view data[:, 2]` most likely.

This constructs a brand new `Vector` just to hold `[1.0, d]`, then immediately reduces that down to a scalar via the dot product. Vector’s aren’t terribly expensive, but they’re not free and they always allocate memory. For small fixed-size arrays, you definitely want StaticArrays.jl. With StaticArrays, this could be:

``````w' * SVector(1.0, d)
``````

or, if you prefer, `w' * @SVector[1.0, d]`.

4 Likes

`rand(unif)`

3 Likes

Yes, and yes! `rand()` or `rand(distribution)` should return a scalar, and this is much cheaper than returning a 1-element vector.

3 Likes

Help me understand this. I know about SVectors, but in your modification, it seems that a new SVector will have to be created every time. How is that cheaper than a standard vector?

I see that you are peppering your code with `@timeit` already. I’d recommend using `@btime` from `BenchmarkTools` for sufficiently encapsulated methods instead (it will also show you allocations then). Did you ever consider using a profiler to get a high level picture of what your code is doing?

SVector usually gets its memory from the stack or even registers instead of the heap

2 Likes

Calls to this function account for nearly 80% of my allocations:

``````function logPriorGaussian(w,μ,Σ)::Float64
a = MvNormal(μ,Σ)
return log(pdf(a,w))
end
``````

The parameters to the MVN don’t change so I suppose I could define it once outside the main loop and pass it in instead of defining it every time. Other than that, any way to speed this up?

1 Like

If no statistician opposes (and stupid me doesn’t see a reason), go ahead.

If you let use see another MWE after this?

Instead of @timeit, a very nice tool for granular inspection of timings and allocations is this package: Timer Outputs