Mutating arrays inside function

Hello,

I am new to Julia and just trying out things and ran into something that I found unintuitive and wanted to understand the canonical way.

Basically, I am trying to pass in an array to a function and have the function modify the input array and not create any temporary objects. Here is a MWE:

function foo!(x, y)
    x = 2*x + y
end

function bar!(x, y)
    x[:] = 2*x + y
end

x = ones(10)
y = ones(10)

foo!(x,y)  # does not update x 

fill!(x, 1)
fill!(y, 1)

bar!(x,y) # updates x

function bar does what I want, but function foo does not. From reading around in Julia documentation, this seems to be expected behaviour that foo does not mutate the input, the assigned x (lhs) in foo is just a local temporary, which I find un-intuitive.

I know that I can do x = foo!(x,y) to get the correct behaviour, but IMO it seems unintuitive that you pass in an array to a function, but have to assign to x again.

If someone can maybe clarify this behaviour and the reason behind it, that would be helpful.

Thanks!

Hi @pratikvn, so the thing to understand here is that foo does not do any mutation, it is just re-binding a local variable.

In julia, variables are just labels that point to a piece of data. In this case, x is labelling an array, and when you write x = 2*x + y, you are saying something like "okay, create a new array 2x + y, and then relabel x such that the label x is used for that array instead. No underlying data is mutated.

Mutation in julia refers to changing the underlying data, rather than the label. So your bar function is instead creating the array 2*x + y and then writing the contents of that array into the array that x labels. No labels are changed.

Does that help clear up the confusion?


As for how to do what you want here, probably the best way to write this function would be to write

function baz!(x, y)
    x .= 2 .* x .+ y
end

which is basically just fancy syntax for a loop

for i in eachindex(x, y)
     x[i] = 2 * x[i] + y[i]
end

The nice thing about this is that there is no temporary intermediate array created, so it’s a lot faster.

9 Likes

As a side note, those docs need some working. Who uses explicitly the broadcast function? That should be the side note. I cannot work on it this week, but if nobody takes a shot, I’ll do it later.

1 Like

I assume this is just a simple example on the part of the OP, but in this particular case it is better to define the function like this

function baz(x, y) 
    return 2 * x + y
end

and then use dots at the call site, like this

x .= baz.(x, y) 
1 Like

Hi @Mason , thanks for the clarification. I understand that foo just rebinds x to a new local variable, but I found it a bit confusing coming from other languages.

Would you say that the canonical way in Julia to mutate the input array arguments (modify underlying data without creating a temporary array) to a function would be to use the . broadcast way ?

Is there a way to do that for more complex operations, for example custom matrix-vector products ?

function matvec!(x, A)
      x[:] = x + A*x
end

The matvec function as I understand it, would create a temporary array and then copy that over to x.

Is there a nice way to express this with the . or broadcast operation ?

I should maybe also clarify that this is a simplification of the actual function I am trying to write, and the actual function might be doing more than just a matrix vector product, for example (ignore the usefulness of the computation of func, this is just for the example),

function func(x, A)
      y .= sin.(A)
      while true
            x[:] = x - A*diag(y)
            if norm(x - y) < 1e-5
                  break
            end
      end
end

Yeah, it’s an important thing to understand because we treat variable bindings very different from mutation. e.g.

julia> x = 1
1

julia> ismutable(x)
false

julia> x = 2
2

Really depends on what you’re doing. . syntax is nice, but I guess the most fundamental object is a loop.

If you want to do a matrix-vector product you should use the function mul!(x, A, x, 1.0, 1.0) which does a fused muladd. operation in place.

2 Likes

As far as I understand, this is what most languages do.

There’s very few languages that allow assigning to an argument to affect the binding in the caller. I believe Fortran allows this and C++ “pass by reference” arguments do this. But yeah, it’s unusual and generally considers a bad idea.

On the other hand it’s also standard that if you pass a reference to an object into a function and modify that object that the modification is visible to all code that has a reference to that object. (This is almost tautological since it serves as a semantic definition of what it means for the reference to be to the same object in the first place.) Some numerical languages like Matlab and R deviate from this specifically for array arguments by doing something called “copy on write” where they create a new reference object to the same array memory when an array object is passed into a function call, and if any modification is made they then copy that point. This is not good for writing really high performance for though since argument passing is not free and it’s hard to avoid accidental copies and you can’t write functions that perform visible mutation. So that’s again quite unusual.

Bottom line: what Julia does here is absolutely bog standard and common. It’s exactly what mainstream languages with objects like Lisp, Python and Java all do.

5 Likes

Fortran requires this because it doesn’t have the syntax for returning more than one value from a function.

I don’t think mul! accepts an aliased destination (may lead to incorrect results).

3 Likes

The default behaviour for most languages that I have worked with, is to not locally create another variable label when there is a variable of the same label in the scope. That is what I found unintuitive.

Maybe a more concrete example of the code Ithat I want to run:

     0   -     function solve!(x, A, b, omega, tolerance, maxiter)
     1   0         validate(A, b, x)
     2  144        r = b - A*x
     3   -         iter = 0
     4   0         while true
     5   0             iter = iter + 1
     6   0             x .= x .+ omega .* r
     7 5184            r = b - A * x
     8   0             if ((norm(r) <= tolerance) || (iter >= maxiter))
     9   -                 break
    10   -             end
    11   -          end
    12   0          return iter
    13   -     end

which does a matrix vector product in a loop. I ran the profiler with vectors of size 10. On the left column, you see the allocation in bytes.

I understand the first allocation (line 2), but I would like to to remove any allocations within the loop, particularly one on line 7. What I believe is happening is that in the loop, a vector is being allocated every loop iteration which I dont think should be necessary.

Is there a way to express this such that there are no allocations ? I havent tested if using some library to do the residual computation would get rid of them, but using a custom loop for example as below, did not help (I still see allocations in line 2 below)

0  function compute_residual!(r, A, b, x)
1     for row in 1:size(b, 1)
2	     r[row, 1] = b[row, 1] - (A[row, :]' * x)[1]
3     end
4  end

The line r = b - A*x allocates in several places:

  1. A*x allocates a vector for the result of the matrix multiplication
  2. b - ... allocates a vector for the difference
  3. r = ... then just names that vector, i.e., does not allocate itself (assigning via = never does)

You can fuse all operations into an in-place loop as follows:

r .= b .- dot.(eachrow(A), Ref(x))
# essentially equivalent to the following loop
for i in eachindex(r)
    r[i] = b[i] - dot(@view(A[i, :]), x)
end

Note that A[i, :] allocates, i.e., you need an explicit view to prevent this.

I’m not aware of any languages where that second assignment would modify the existing array. The same memory might be reused as an optimization if there is no other reference to the array, but that’s strictly an optimization. I’d be very interested in examples of where that’s how a language works.

2 Likes

@bertschi’s answer works but does not use BLAS which will be more efficient for large matrices. I think a more canonical way (that’s also easier to parse for a reader) is to use mul! like this:

r .= b
mul!(b, A, x, -1, 1) # performs x[:] = x-Ax without allocation
1 Like