Pre-allocated return arguments


#1

As a total newbie, I have a question about pre-allocated return arguments. Imagine a function that is called a huge number of times. It takes a vector and a scalar as input, manipulates them, and returns the modified vector and scalar to the caller. This can be done without unnecessary memory allocation as follows:

function rettest!(v::Array{Float64,1},s::Array{Float64,0})
   for i=1:100
      v[i]=v[i]*2.0
   end
   s[1]=s[1]*2.0
   nothing
end

function caller1()
   v=ones(Float64,100)
   s=ones(Float64)
   for k=1:10 #in reality 1000000000 instead of 10
      rettest!(v,s)
   end
end

This works fine, in the sense that no allocation takes place in either rettest! or caller1 (apart, obviously, from when v and s are first created). But in a more complex scenario rettest! can have many input parameters, only a few of which are actually modified, so that looking at the caller does not reveal what rettest! changes. Another possibility is:

function rettest(v::Array{Float64,1},s::Array{Float64,0})
   for i=1:100
      v[i]=v[i]*2.0
   end
   s[1]=s[1]*2.0
   return v,s
end

function caller2()
   v=ones(Float64,100)
   s=ones(Float64)
   for k=1:10 #in reality 1000000000 instead of 10
      (v,s)=rettest(v,s)
   end
end

Now what rettest changes has become explicit, and no allocation takes place in caller2 itself, but the return statement in rettest allocates. The question is: can the best of both worlds be obtained by using macros or other means? Ideally I would like the memory-cheap behaviour of the first approach and the nice expressive syntax of the second. Can the compiler be smart enough to understand what is meant by (x,y)=f(x,y,a,b,c)?

(In f90 one can use intent(in) / intent(out) / intent(inout) to clearly identify what happens to subroutine arguments without having to look at the subroutine’s contents, and now you know which language I normally use…)


#2

#3

In full generality, I don’t think you can do this in Julia.

In your example, both versions of rettest do the same thing, as evidenced by

function caller3()
   v=ones(Float64,100)
   s=ones(Float64)
   for k=1:10 #in reality 1000000000 instead of 10
      rettest(v,s)
   end
   @show v[1]
   nothing
end

caller3()
#v[1] = 1024.0

The extra allocations are very small: They allocate space for the tuple of pointers (to v and s).
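
If you want to check the size of these allocations on your own machine, something along these lines should work. This is only a sketch: rettest! and rettest are restated from the first post (with eachindex instead of the hard-coded 1:100), while loop_inplace!, loop_tuple! and measure are names I made up for illustration. I am deliberately not quoting numbers, because the result depends on the Julia version and on what the compiler decides to inline.

```julia
# Restated from the first post: mutate v and s in place, return nothing.
function rettest!(v::Array{Float64,1}, s::Array{Float64,0})
    for i in eachindex(v)
        v[i] = v[i] * 2.0
    end
    s[] = s[] * 2.0          # s[] accesses the single element of a 0-dim array
    return nothing
end

# Restated from the first post: same mutation, but also return (v, s).
function rettest(v::Array{Float64,1}, s::Array{Float64,0})
    rettest!(v, s)
    return v, s
end

# Hypothetical helpers: run each calling style n times.
function loop_inplace!(v, s, n)
    for k = 1:n
        rettest!(v, s)
    end
    return nothing
end

function loop_tuple!(v, s, n)
    for k = 1:n
        (v, s) = rettest(v, s)
    end
    return nothing
end

# Hypothetical helper: bytes allocated by each loop, measured with @allocated.
function measure(n)
    v = ones(Float64, 100)
    s = fill(1.0)              # 0-dimensional Array{Float64,0}
    loop_inplace!(v, s, 1)     # warm-up calls, so compilation is not counted
    loop_tuple!(v, s, 1)
    bytes_inplace = @allocated loop_inplace!(v, s, n)
    bytes_tuple = @allocated loop_tuple!(v, s, n)
    return bytes_inplace, bytes_tuple
end

bytes_inplace, bytes_tuple = measure(10)
println("in-place loop: ", bytes_inplace, " bytes; tuple loop: ", bytes_tuple, " bytes")
```

Dividing bytes_tuple by the number of iterations gives a rough per-call cost of the returned tuple, if there is any.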

It is an eternal gripe of mine that immutables and multiple return values containing GC-controlled pointers (aka mutable objects) are currently heap-allocated in Julia. See e.g.
https://discourse.julialang.org/t/immutables-with-reference-fields-why-boxed/7706/21. The general strategy is to avoid creating, in inner loops, tuples or nontrivial immutables that contain references to mutable objects. The only exception is wrappers around mutable objects, which are OK (a wrapper here means that the mutable object is the only field).

OTOH, heap allocation of these small objects is really fast, and sometimes the compiler can optimize it out entirely.

In your specific example, you could use broadcasting.
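
A minimal sketch of what I mean by broadcasting, assuming the operation really is just the doubling from the first post. The names double_inplace! and caller_broadcast are made up for illustration, and the scalar is kept as a plain local variable here instead of a 0-dimensional array.

```julia
# Hypothetical helper: double every element of v in place.
# `.*=` is a fused in-place broadcast, so no temporary array is created.
function double_inplace!(v::Vector{Float64})
    v .*= 2.0
    return nothing
end

# Hypothetical caller, mirroring caller1 from the first post.
function caller_broadcast()
    v = ones(Float64, 100)
    s = 1.0                  # plain scalar, no array needed
    for k = 1:10
        double_inplace!(v)
        s *= 2.0
    end
    return v, s
end

v, s = caller_broadcast()
```

For an update this simple you could also write v .*= 2.0 directly in the loop and drop the helper.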

For single-element arrays, you could also use Ref.

If you need to return multiple values, at least one of which is mutable, then the non-allocating way is to pass Ref{}s for the scalars and write to them. This looks really ugly, so unless this really is an inner loop, the general consensus is to take the performance hit of allocating the tuple, or to play with @inline until the compiler decides that the allocation can be skipped.
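
A sketch of the Ref variant, again assuming the doubling from the first post. rettest_ref! and caller_ref are hypothetical names, not anything from Base.

```julia
# Hypothetical variant of rettest!: the scalar travels in a Ref
# instead of a 0-dimensional array, and is updated in place.
function rettest_ref!(v::Vector{Float64}, s::Ref{Float64})
    for i in eachindex(v)
        v[i] *= 2.0
    end
    s[] *= 2.0               # s[] reads or writes the value stored in the Ref
    return nothing
end

# Hypothetical caller: the Ref is created once, outside the loop.
function caller_ref()
    v = ones(Float64, 100)
    s = Ref(1.0)
    for k = 1:10
        rettest_ref!(v, s)
    end
    return v[1], s[]
end

first_element, scalar = caller_ref()
```

At the call site this reads the same as the rettest! version from the first post, so it does not make it any more obvious which arguments are modified; that part is still carried only by the ! convention.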


#5

Fixed the backticks issue. Thanks!


#6

This is the key point. @Pier: if you just don’t use the returned values (e.g. just call rettest(v, s) and ignore them), I think that even that allocation is elided.

But perhaps it would be better to name the function rettest! rather than rettest, since it does modify its arguments.