Strategies to reuse memory

Dharik_Arsath · January 17, 2022, 9:11am

I have a preallocated vector temp that holds the values of (y - y_pred).^2, where y and y_pred are 1D vectors.

I have to reuse that vector in two or more different functions. What is the best way to do this? Note: performance matters.

eg:

const temp = Vector{Float64}(undef, size(x, 1))

function mse(x::Vector{Float64}, y::Vector{Int}, w::Float64, b::Float64)
    # y_pred is the prediction vector (its computation is not shown in this post)
    for ind in eachindex(y, y_pred)
        temp[ind] = (y[ind] - y_pred[ind])^2  # (y - y_pred)^2
    end
end

function batchgd(x::Vector{Float64}, y::Vector{Int}, w::Float64, b::Float64, lr::Float64, threshold = 0.00001, max_iter = 10000)
    for ind in eachindex(temp)
        temp[ind] = sqrt(temp[ind])  # temp already holds the squared values, so take the sqrt (reusing memory)
    end
end

Here I reuse the memory in temp with a minor change: temp already holds the squared values, so in batchgd I only take the sqrt. So my question is: what is the best way to reuse memory?

Creating a vector globally and using it wherever required
Computing y - y_pred every time within each function where it is needed
Returning the vector from one function and using it in another

Is there a better way?

mauro3 · January 17, 2022, 9:25am

Also, temp in your example is a non-const global, which is really bad for performance. See the “Performance tips” in the docs.
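As a rough illustration of the alternative to a global buffer, the sketch below passes the preallocated vector as an argument, so each function sees a concretely typed array. The helper names squared_errors! and mean_of are hypothetical, and the linear model y_pred = w*x + b is an assumption; the original post does not show how y_pred is computed.

```julia
# Hypothetical helper: fills `temp` in place with squared errors.
# The model y_pred = w * x + b is an assumption made for this sketch.
function squared_errors!(temp::Vector{Float64}, x::Vector{Float64},
                         y::Vector{Int}, w::Float64, b::Float64)
    for i in eachindex(temp, x, y)
        y_pred = w * x[i] + b
        temp[i] = (y[i] - y_pred)^2
    end
    return temp
end

# Hypothetical helper: reads the buffer without allocating a new array.
mean_of(temp::Vector{Float64}) = sum(temp) / length(temp)

x = [1.0, 2.0, 3.0]
y = [2, 4, 6]
temp = Vector{Float64}(undef, length(x))  # allocated once by the caller

squared_errors!(temp, x, y, 1.0, 0.0)     # writes into temp
loss = mean_of(temp)                      # reuses the same memory
```

Because temp reaches the functions as an argument, it does not matter for performance whether the caller's variable is a global; the compiler specializes each function on the argument types.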

Dharik_Arsath · January 17, 2022, 9:54am

I have changed my code now. I guess it is good enough now. Thanks

pixel · January 17, 2022, 10:14am

For me it’s an interesting question. I have many algorithms that need to set something up (allocate memory, pre-calculate some parameters) and then an actual function that uses them to do the main part of the work. So what is the best way to pass the allocated/pre-calculated parameters to the actual function? Sure, passing them in as parameters, or passing a structure containing them, works fine. But I have a dozen or so algorithms, so I end up passing a large number of parameters, which gets messy.

In other languages I would just use a global variable or a common block for the precalculated stuff, which makes a lot of complexity vanish. But that’s not good for parallelisation and not the Julia way. Not a big issue, I just find passing a lot of parameters around to be a bit mechanical and distracting from the actual problem solving.

Henrique_Becker · January 17, 2022, 12:30pm

The only “Julian” improvement I can see is to use NamedTuples, so that each function takes at most one parameter holding everything that needs to be precomputed/preallocated, without having to create multiple distinct structs depending on exactly which objects each function takes.
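A minimal sketch of that idea, assuming a workspace with two scratch vectors; the names make_workspace, step!, residual and scratch are hypothetical and chosen only for illustration:

```julia
# Hypothetical constructor: bundles all preallocated buffers in one NamedTuple.
make_workspace(n) = (residual = zeros(n), scratch = zeros(n))

# The function takes a single extra argument and picks out the fields it needs.
function step!(x, ws)
    ws.residual .= 2 .* x          # in-place, no allocation
    ws.scratch  .= x .^ 2
    x .+= ws.residual .* ws.scratch
    return x
end

ws = make_workspace(3)
x = [1.0, 2.0, 3.0]
step!(x, ws)
```

The NamedTuple itself is immutable, but the vectors it holds are not, so the buffers can still be overwritten in place on every call.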

lmiq · January 17, 2022, 12:49pm

An interesting pattern for this is to define the function with keyword parameters defining the auxiliary arrays (I will use a tuple here as suggested above):

julia> function f!(x;aux=(zeros(3),zeros(3)))
           a, b = aux
           a .= 2 .* x
           b .= x .^ 2
           x .= x .+ a .* b
           return x
       end
f! (generic function with 1 method)

julia> using BenchmarkTools

julia> xin = rand(3);

julia> @btime f!(x) setup=(x=copy(xin)) evals=1
  135.000 ns (2 allocations: 160 bytes)
3-element Vector{Float64}:
 0.9603701960196425
 1.0196162985304351
 2.727934952974922

julia> auxin = (zeros(3),zeros(3));

julia> @btime f!(x,aux=aux) setup=(x=copy(xin);aux=deepcopy(auxin)) evals=1
  66.000 ns (0 allocations: 0 bytes)
3-element Vector{Float64}:
 0.9603701960196425
 1.0196162985304351
 2.727934952974922

The nice thing about this pattern is that you can develop the code without worrying about these allocations and then add the preallocated buffers as a performance optimization where needed. Thus the complexity appears only where it is really needed.

It is possible, of course, to create constant global variables to act as buffers, but that breaks modularity and in the end increases code maintenance complexity.

Concerning this point, I think one useful pattern is to define two functions, taking advantage of multiple dispatch: one that receives preallocated vectors and returns them, and another that neither receives nor returns them. Then you can again take care of preallocations only when needed, as an optimization:

julia> function f!(x,aux)
           a, b = aux
           x .= x .+ a .* b
           return x, aux
       end
f! (generic function with 2 methods)

julia> function f!(x)
           aux = (zeros(3),zeros(3))
           f!(x,aux)
           return x
       end
f! (generic function with 2 methods)

julia> x = rand(3); aux = (zeros(3),zeros(3))
([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])

julia> x, aux = f!(x,aux)
([0.01274472197573262, 0.9823930168953983, 0.48667397158434544], ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))

julia> x = f!(x)
3-element Vector{Float64}:
 0.01274472197573262
 0.9823930168953983
 0.48667397158434544

(PS: since I wrote everything to mutate in place, returning the variables is optional here; in these examples the functions could just as well return nothing instead.)

Dharik_Arsath · February 8, 2022, 11:25am

My next question is why Julia follows the convention of appending “!” to the function name. Why not pass a parameter instead, like this:

function sum_of_arr(arr, out=nothing)
    if out === nothing
        out = zeros(eltype(arr), 1)  # create new array
    end                              # otherwise some preallocated array was passed
    out[1] = sum(arr)                # sum all and store in out array
    return out
end

Why does the Julia community not do it this way? The problem I see is that I have to write each function twice, with ! and without !, which makes the code longer.

lmiq · February 8, 2022, 12:02pm

Not all functions have something to mutate; some functions have more than one mutable input, and some have optional mutable arguments, so there could not be a single standard notation. The ! is a warning that something is being mutated there.

Of course, you don’t have to write the same code twice; you just have to define a function that calls the other, as:

julia> function f!(x,y) # mutates
           y[1] = 2 * x[1] # true function body, complex code
           return y
       end
f! (generic function with 1 method)

julia> function f(x) # does not mutate
           y = zero(x) # allocate
           return f!(x,y) # just call the previous one
       end
f (generic function with 1 method)

Thus, the “real” function is the one that does the operation in place, and the one without the ! just preallocates stuff and calls that one.

Benny · February 14, 2022, 11:41pm

function sum_of_arr(arr,out=nothing) actually defines two methods sum_of_arr(arr) and sum_of_arr(arr, out), so along with some compiler optimizations, it ends up doing the same thing as @lmiq’s f and f! methods. The “f uses f!” is really just a convention for quickly spotting a function call that may mutate any of its (mutable) inputs.

There are a couple problems with using an out convention. The smaller one is that a method can mutate many arguments, not just 1 out, but that can be fixed by naming arguments as out1/out2/etc or appending ! to more distinctive names. The bigger one is that you can’t write positional arguments with their names in function calls: function sum_of_arr(arr, out! = nothing) can be called as sum_of_arr(arr1, arr2), NOT sum_of_arr(arr1, out! = arr2). And really, function calls are where you would want to see if a variable is being mutated.
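Both observations can be checked with a small sketch. The body of sum_of_arr below is an assumption (the thread only gives pseudocode); the point is the method count and the failing keyword-style call:

```julia
# Assumed body: store the sum in a one-element array, allocating it if needed.
function sum_of_arr(arr, out = nothing)
    buf = out === nothing ? zeros(eltype(arr), 1) : out
    buf[1] = sum(arr)
    return buf
end

# The optional positional argument produces two methods:
# sum_of_arr(arr) and sum_of_arr(arr, out).
n_methods = length(methods(sum_of_arr))

buf = zeros(1)
sum_of_arr([1.0, 2.0], buf)        # positional call works and fills buf

# Naming a positional argument at the call site is not allowed.
keyword_call_failed = try
    sum_of_arr([1.0, 2.0], out = buf)
    false
catch err
    err isa MethodError
end
```

So the call site cannot show that buf is the mutated argument, whereas a name such as sum_of_arr! would flag the mutation at every call.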

lawless-m · February 15, 2022, 10:17am

There are real cases where specifying an out parameter does not fit the model.

using DataFrames

# just some random dataframe in case you want to try it
df = DataFrame(a=rand(1_000_000), b=rand(1_000_000), c=rand(1_000_000));
# a hypothetical out keyword (not an actual DataFrames.jl API) would *copy* the data,
# which might be GB in size, and the writer of select must do a dance to see
# if the in and out parameter are the same,
# otherwise the data might be clobbered before it can be copied
# select(df, Not(:c), out=df)
# this does *not* copy the data, it just drops the column
select!(df, Not(:c))
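The clobbering risk is not specific to data frames. As a plain-array sketch (reverse_into! is a hypothetical function written for this illustration), an out-style routine that silently assumes distinct input and output buffers gives a wrong result when the same array is passed for both:

```julia
# Hypothetical out-style routine: writes the reverse of `a` into `out`.
# It is only correct when `out` and `a` are different arrays.
function reverse_into!(out, a)
    n = length(a)
    for i in 1:n
        out[i] = a[n - i + 1]
    end
    return out
end

a = [1, 2, 3, 4]
separate = reverse_into!(similar(a), a)   # distinct buffers: correct reversal

b = [1, 2, 3, 4]
aliased = reverse_into!(b, b)             # same array in and out: data is overwritten before it is read
```

A dedicated in-place method (as select! is for data frames) can be written with the aliasing in mind from the start, instead of having to detect it at run time.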
