I am new to Julia and just trying out things and ran into something that I found unintuitive and wanted to understand the canonical way.
Basically, I am trying to pass in an array to a function and have the function modify the input array and not create any temporary objects. Here is a MWE:
function foo!(x, y)
x = 2*x + y
end
function bar!(x, y)
x[:] = 2*x + y
end
x = ones(10)
y = ones(10)
foo!(x,y) # does not update x
fill!(x, 1)
fill!(y, 1)
bar!(x,y) # updates x
function bar does what I want, but function foo does not. From reading around in Julia documentation, this seems to be expected behaviour that foo does not mutate the input, the assigned x (lhs) in foo is just a local temporary, which I find un-intuitive.
I know that I can do x = foo!(x,y) to get the correct behaviour, but IMO it seems unintuitive that you pass in an array to a function, but have to assign to x again.
If someone can maybe clarify this behaviour and the reason behind it, that would be helpful.
Hi @pratikvn, so the thing to understand here is that foo does not do any mutation, it is just re-binding a local variable.
In julia, variables are just labels that point to a piece of data. In this case, x is labelling an array, and when you write x = 2*x + y, you are saying something like "okay, create a new array 2x + y, and then relabel x such that the label x is used for that array instead. No underlying data is mutated.
Mutation in julia refers to changing the underlying data, rather than the label. So your bar function is instead creating the array 2*x + y and then writing the contents of that array into the array that x labels. No labels are changed.
Does that help clear up the confusion?
As for how to do what you want here, probably the best way to write this function would be to write
As a side note, those docs need some working. Who uses explicitly the broadcast function? That should be the side note. I cannot work on it this week, but if nobody takes a shot, I’ll do it later.
Hi @Mason , thanks for the clarification. I understand that foo just rebinds x to a new local variable, but I found it a bit confusing coming from other languages.
Would you say that the canonical way in Julia to mutate the input array arguments (modify underlying data without creating a temporary array) to a function would be to use the . broadcast way ?
Is there a way to do that for more complex operations, for example custom matrix-vector products ?
function matvec!(x, A)
x[:] = x + A*x
end
The matvec function as I understand it, would create a temporary array and then copy that over to x.
Is there a nice way to express this with the . or broadcast operation ?
I should maybe also clarify that this is a simplification of the actual function I am trying to write, and the actual function might be doing more than just a matrix vector product, for example (ignore the usefulness of the computation of func, this is just for the example),
function func(x, A)
y .= sin.(A)
while true
x[:] = x - A*diag(y)
if norm(x - y) < 1e-5
break
end
end
end
There’s very few languages that allow assigning to an argument to affect the binding in the caller. I believe Fortran allows this and C++ “pass by reference” arguments do this. But yeah, it’s unusual and generally considers a bad idea.
On the other hand it’s also standard that if you pass a reference to an object into a function and modify that object that the modification is visible to all code that has a reference to that object. (This is almost tautological since it serves as a semantic definition of what it means for the reference to be to the same object in the first place.) Some numerical languages like Matlab and R deviate from this specifically for array arguments by doing something called “copy on write” where they create a new reference object to the same array memory when an array object is passed into a function call, and if any modification is made they then copy that point. This is not good for writing really high performance for though since argument passing is not free and it’s hard to avoid accidental copies and you can’t write functions that perform visible mutation. So that’s again quite unusual.
Bottom line: what Julia does here is absolutely bog standard and common. It’s exactly what mainstream languages with objects like Lisp, Python and Java all do.
The default behaviour for most languages that I have worked with, is to not locally create another variable label when there is a variable of the same label in the scope. That is what I found unintuitive.
Maybe a more concrete example of the code Ithat I want to run:
0 - function solve!(x, A, b, omega, tolerance, maxiter)
1 0 validate(A, b, x)
2 144 r = b - A*x
3 - iter = 0
4 0 while true
5 0 iter = iter + 1
6 0 x .= x .+ omega .* r
7 5184 r = b - A * x
8 0 if ((norm(r) <= tolerance) || (iter >= maxiter))
9 - break
10 - end
11 - end
12 0 return iter
13 - end
which does a matrix vector product in a loop. I ran the profiler with vectors of size 10. On the left column, you see the allocation in bytes.
I understand the first allocation (line 2), but I would like to to remove any allocations within the loop, particularly one on line 7. What I believe is happening is that in the loop, a vector is being allocated every loop iteration which I dont think should be necessary.
Is there a way to express this such that there are no allocations ? I havent tested if using some library to do the residual computation would get rid of them, but using a custom loop for example as below, did not help (I still see allocations in line 2 below)
0 function compute_residual!(r, A, b, x)
1 for row in 1:size(b, 1)
2 r[row, 1] = b[row, 1] - (A[row, :]' * x)[1]
3 end
4 end
I’m not aware of any languages where that second assignment would modify the existing array. The same memory might be reused as an optimization if there is no other reference to the array, but that’s strictly an optimization. I’d be very interested in examples of where that’s how a language works.
@bertschi’s answer works but does not use BLAS which will be more efficient for large matrices. I think a more canonical way (that’s also easier to parse for a reader) is to use mul! like this:
r .= b
mul!(b, A, x, -1, 1) # performs x[:] = x-Ax without allocation