# Memory allocation within loop

I have a `for` loop in which I create a function that I optimize, something like:

``````
using Interpolations, Optim

function PolicyFunction()
    V = Array{Float64}(undef, 201, 40)
    n = Array{Float64}(undef, 201, 40)

    grid_h = range(10.0, stop = 300.0, length = 201)

    for j = 40:-1:1
        # Define value vector for next period
        if j == 40  # Last period
            nextV = zeros(201)
        else
            nextV = V[:, j+1]
        end

        # Linear interpolation for this period
        MyItp = interpolate((grid_h,), nextV, Gridded(Linear()))
        V_tw = extrapolate(MyItp, Line())

        for (ind_h, h) in enumerate(grid_h)
            # Auxiliary objective: minimise the negative to maximise
            aux_f(n_td) = -(log(1 - n_td) + 0.95 * V_tw(n_td^2.0))
            result = optimize(aux_f, 0.01, 0.99, GoldenSection())

            Optim.converged(result) || error("Failed to converge")
            n[ind_h, j] = Optim.minimizer(result)
            V[ind_h, j] = -Optim.minimum(result)
        end
    end
    return n
end
``````

With this implementation I get:

``````
@time PolicyFunction()
  0.106507 seconds (1.68 M allocations: 32.176 MiB, 3.05% gc time)
``````

I’m happy with the time, but I was surprised by the memory allocation (in my real application I get something like `148.13 M allocations: 2.238 GiB, 2.08% gc time`). When I use a more rudimentary solution like:

``````
function PolicyFunction()
    V = Array{Float64}(undef, 201, 40)
    n = Array{Float64}(undef, 201, 40)

    grid_h = range(10.0, stop = 300.0, length = 201)

    for j = 40:-1:1
        # Define value vector for next period
        if j == 40  # Last period
            nextV = zeros(201)
        else
            nextV = V[:, j+1]
        end

        # Linear interpolation for this period
        MyItp = interpolate((grid_h,), nextV, Gridded(Linear()))
        V_tw = extrapolate(MyItp, Line())

        for (ind_h, h) in enumerate(grid_h)
            # Brute-force search over a fixed grid for n_td
            V_max = -Inf
            for n_td in range(0.01, stop = 0.99, length = 201)
                V_td = log(1 - n_td) + 0.95 * V_tw(n_td^2.0)

                if V_td > V_max
                    V_max = V_td
                    n[ind_h, j] = n_td
                    V[ind_h, j] = V_td
                end
            end
        end
    end
    return n
end
``````

The time increases (it depends, of course, on the number of grid points for `n_td`), but the memory use is much lower (in my real application the difference is even more dramatic):

``````
  0.195065 seconds (150.49 k allocations: 7.371 MiB, 1.22% gc time)
``````

Can anybody explain why there is such a difference in memory allocation? Is it because I define a function in each iteration? Is there a way around it?

Optim probably allocates some work buffers in `optimize`.
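You can check this directly with `@allocated`. This is a sketch using a toy objective (not the one above, and the byte count will vary across Optim versions); the warm-up call matters so compilation is not counted:

``````
using Optim

# Hypothetical toy objective, just for measurement
f(x) = (x - 0.5)^2

optimize(f, 0.0, 1.0, GoldenSection())             # warm-up: compile first
bytes = @allocated optimize(f, 0.0, 1.0, GoldenSection())
println(bytes)  # non-zero even for a trivial objective
``````

Multiplied by the number of inner-loop calls (201 × 40 here), even a small per-call figure adds up quickly.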

If you are happy with the time, do you really care how many allocations occurred?


I get similar performance when I use my own implementation of the golden section.

I was thinking that this high memory use might be masking a problem or inefficient coding somewhere…

Edit: This function is part of a bigger iteration and ends up being called many times.

Yeah, that is sometimes the case. The only way to really know is to do some profiling and see where the time is spent. Then you could try to optimize that and see whether reducing allocations helps with performance.


Any tutorial on how to do profiling?

Thanks!

There are a few profiling options, depending on your feelings about GUIs. Julia’s built-in sampling profiler will print a text-based function call tree annotated with the number of samples recorded in each function, which is a proxy for the time spent there:

``````
using Profile
Profile.clear()
@profile PolicyFunction()
Profile.print()
``````

ProfileView.jl gives you a graphical view of the tree, and Juno’s built-in `@profiler` macro (my preferred method) gives a view of the tree as well as a histogram of call frequency overlaid on the lines of your script.
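If allocations rather than time are what you want to track down, Julia can also annotate allocations per source line via the `--track-allocation` flag. This is a sketch of the usual workflow (the file name is hypothetical):

``````
# Start Julia as:  julia --track-allocation=user
include("policy.jl")          # hypothetical file defining PolicyFunction
PolicyFunction()              # warm-up run, so compilation allocations...
using Profile
Profile.clear_malloc_data()   # ...are discarded here
PolicyFunction()              # only this run's allocations are recorded
# After exiting, each source file gets a *.mem sibling listing bytes per line.
``````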


Optim creates a new results object for every optimization that you do; it does not reuse it. This leads to large memory allocations when the optimization sits in your inner loop like this.

However! From what I remember, I did hack together my own univariate optimizer exactly modeled off the one in the package but that didn’t create structs for every run. Memory allocations decreased, but the performance was really not much different (again, just my recollection). I think they just get garbage collected fairly efficiently or something.
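For reference, a hand-rolled golden-section search along those lines can be quite short. This is only a sketch (the name, tolerance, and midpoint return value are my choices, not Optim’s): it works on plain floats and returns a tuple rather than a results struct, so once compiled it allocates nothing per call.

``````
# Minimise f on [a, b] by golden-section search.
function golden_section(f, a::Float64, b::Float64; tol = 1e-8)
    invphi = (sqrt(5.0) - 1.0) / 2.0    # 1/φ ≈ 0.618
    c = b - invphi * (b - a)            # lower interior probe
    d = a + invphi * (b - a)            # upper interior probe
    fc, fd = f(c), f(d)
    while (b - a) > tol
        if fc < fd
            # Minimum lies in [a, d]; reuse c as the new upper probe
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else
            # Minimum lies in [c, b]; reuse d as the new lower probe
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
        end
    end
    xmin = (a + b) / 2
    return xmin, f(xmin)
end

xmin, fmin = golden_section(x -> (x - 0.3)^2, 0.01, 0.99)  # xmin ≈ 0.3
``````

Swapping this in for `optimize` in the first version above should cut the per-iteration allocations, though (as noted) the wall-clock gain may be modest since Julia’s GC handles many small short-lived objects fairly well.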


Hi @drarnau, I’m also trying to code up a lifecycle model in Julia, which looks very similar to the example you posted. Did you end up finding a way to reduce allocations/speed up the optimization?