! notation for individual modified variables

proposal

#1

I’m a big fan of the ! style recommendation for functions that modify their parameters.

For certain functions with many arguments, functionname!(foo, bar, baz) can be ambiguous: How do I know if foo, bar or baz gets modified, or if potentially multiple parameters get modified?

I want to start a discussion about a style convention that clarifies which parameters will be modified. I imagine appending a ! to each modified parameter. If functionname! modifies foo and baz, its declaration would read functionname!(foo!, bar, baz!).
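To make the proposal concrete, here is a minimal sketch of what such a declaration could look like. The function `normalize_and_record!` and its parameter names are hypothetical and exist only for illustration; the sketch relies on `!` being a legal character in Julia identifiers.

```julia
# Hypothetical example: the function name and parameter names are made up
# to illustrate the proposed convention.
function normalize_and_record!(values!, scale, history!)
    values! ./= scale          # mutates the first argument in place
    push!(history!, scale)     # mutates the third argument in place
    return values!
end

v = [2.0, 4.0]
hist = Float64[]
normalize_and_record!(v, 2.0, hist)   # callers pass ordinary variable names
```

Only the names in the declaration carry the `!`; the call site is unchanged.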

I’m uncertain whether this is a reasonable thing to do for all !-functions, or only for !-functions that are ambiguous about the modified parameters.


#2

That’s not possible in general, since you can have variables modified in one function call but not in another. In fact, I would venture to say that’s quite common.


#3

you can have variables modified in one function call but not in another

Does “variables” refer to the function parameters here?

If so, I think the same holds for !-functions (they do not necessarily modify their parameters). The idea is more to point out which functions could potentially modify their parameters. Likewise, it seems like a good idea to me to point out which parameters in particular could potentially be modified.


#4

No, for functions you’d only name it with a ! if it does modify. Functions can do multiple things by dispatch, but it’s only sensible to make them do “about the same thing”.

For variables, there’s a lot of very obvious cases where this doesn’t make sense. Here’s a quick example:

A_mul_B!(C!,A,B)
A_mul_B!(E!,D,C!)

unless you want to have as part of the style doing:

A_mul_B!(C!,A,B)
C = C! # aliasing as part of the style?
A_mul_B!(E!,D,C)

then you’re going to have lots of variables carrying a ! in their name in calls where they aren’t being mutated. This has to happen, because why would you mutate variables you aren’t going to use? The only reason would be output, but this means any intermediate cache would either have two names or be misnamed. That’s a pretty big case to have no good answer for.
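A runnable sketch of the same situation, assuming Julia 1.x, where `mul!` from the `LinearAlgebra` standard library plays the role of `A_mul_B!`. The matrices are arbitrary illustrative values.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [1.0 0.0; 0.0 1.0]   # identity, so C ends up equal to A
D = [2.0 0.0; 0.0 2.0]

C = zeros(2, 2)          # intermediate cache
E = zeros(2, 2)          # final output

mul!(C, A, B)            # C is written here ...
mul!(E, D, C)            # ... and only read here
```

The same variable `C` is an output in the first call and a plain input in the second, so no single name with or without `!` describes both uses.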


#5

@ChrisRackauckas that clarifies it. Do you think functions like your A_mul_B example make up the common case? Looking through https://docs.julialang.org/en/release-0.4/stdlib/, it actually looks to me like the vast majority of functions just modify their first parameter in an obvious way. I wonder if there is a case for the parameter! notation for the remaining niche case of more than two modified parameters.


#6

The example I showed is very common. In any case where you’d use a mutating function to write into a tmp!, is there a high chance you’re going to use tmp! afterwards? The answer is of course yes, otherwise you would not have computed tmp!. But if you’re going to use tmp!, you’re not necessarily going to modify it again, so you’re going to have a tmp! argument that’s not mutated in a bunch of other function calls. Once you see this logic, you see how this will happen in almost any case where a larger function internally uses a mutating function. If you don’t believe me, apply this naming scheme to any non-allocating function that you’ve already written and you’ll see how all of the cache variables play a dual role of “written into” and “used as a variable”.

Yes, the Julia convention is to modify the values in the front. This keeps things tidy and makes it easy to know what’s modified. There could be some edge cases but in general it works well.


#7

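Generally the style follows something like this:

function magic!(storage::AbstractVector, obj)
    storage .= obj    # overwrite storage in place (scalar or same-length collection)
    return storage    # by convention, return the mutated argument
end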

The first argument is the one modified in-place.
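A few functions from Base follow the same pattern and can serve as a quick check of the convention. This is only a small sample, not a survey; the variable names are illustrative.

```julia
dest = zeros(3)
src = [1.0, 2.0, 3.0]

fill!(dest, 7.0)               # overwrites every element of dest
copyto!(dest, src)             # copies src into dest; src is untouched
push!(src, 4.0)                # appends to src in place

# When a function is passed in, it comes first and the destination second.
map!(x -> 2x, dest, [1.0, 2.0, 3.0])
```

`map!` shows the one common wrinkle: a function argument takes the front position, and the mutated argument follows it.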


#8

@ChrisRackauckas ah, yeah, I realize now that I didn’t properly consider nested function calls to other mutating functions. Thanks for the detailed explanation!


#9

It would be nice to have something in f90 style: intent(in), intent(out).
This would also make it possible to “protect” function parameters from unintended modification.


#10

The object model in Julia does not really work like this (since everything is passed by value, adding annotations to the variable doesn’t really make sense). The way to do this would be to wrap the things you pass in into some immutable wrapper, in other words, modify the actual object that is being passed.
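As a rough sketch of the wrapper idea, one could define a read-only view type. `ReadOnlyVector` is a hypothetical name, not an existing Base or package type, and it is restricted to `Vector` to keep the example short.

```julia
# Hypothetical wrapper: exposes reads but defines no setindex! method.
struct ReadOnlyVector{T} <: AbstractVector{T}
    parent::Vector{T}
end

Base.size(r::ReadOnlyVector) = size(r.parent)
Base.getindex(r::ReadOnlyVector, i::Int) = r.parent[i]

v = [1, 2, 3]
r = ReadOnlyVector(v)

total = sum(r)          # reading works through the wrapper

# Writing fails because no setindex! method exists for the wrapper.
write_failed = try
    r[1] = 10
    false
catch
    true
end
```

The protection lives on the object that is passed, not on the variable name, which is the point made above. It is also shallow: anyone holding `v` directly can still mutate it.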


#11

You are obviously right in general, and the fact that functions can easily be made very general with respect to their arguments is one of the beauties of Julia. However, in some (many) cases what is being passed around are plain old Float64 matrices which, I believe, are passed by reference. In this case annotations may make sense.


#12

No, they are still passed by value. If they were passed by reference, the following would print [1,2] and not [2,3]:

function f(x)
    x = [1, 2]
end

x = [2,3];
f(x);
print(x)

Passing by reference means that the variable x is the same outside the function as inside. In Julia, this is never the case; they are two completely separate variables that happen to be bound to the same (mutable) object.


#13

You are of course right, but for programmers coming from the C/C++ world, “passing by reference” is the best approximation of the actual semantics. This occurs, e.g., in the Noteworthy differences section of the docs (twice, for Matlab and C++):

Julia arrays are assigned by reference.

Julia values are passed and assigned by reference.


#14

I don’t think so. It seems to be a source of confusion where people want to put modifiers on the variable while, in reality, it is the object that needs the modifiers.

Yeah, that documentation is using bad terminology and should be fixed.

Edit: https://github.com/JuliaLang/julia/pull/26427


#15

Unless I misinterpreted, I think the original post is about the naming of the formal parameters in a function “declaration”. It’s not about using any sort of special name for passed-in arguments.


#16

I would read OP in the same way. Regardless of what OP meant, I really like the idea: it is a convention that would allow one to read off a lot of behavior from the declaration / first line of the doc-string, in the case of functions that need to mutate several of their parameters. The API would be unchanged, so this would not even be breaking (except for mutated keyword arguments).
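A small sketch of why positional parameters can be renamed freely while keyword arguments cannot. All function names here (`scale_v1!`, `scale_v2!`, `accumulate_v1!`, `accumulate_v2!`) are hypothetical.

```julia
# Positional parameter names are invisible to callers, so renaming is safe.
scale_v1!(x, a)  = (x .*= a; x)
scale_v2!(x!, a) = (x! .*= a; x!)

u = [1.0, 2.0]
w = [1.0, 2.0]
scale_v1!(u, 2.0)
scale_v2!(w, 2.0)          # identical call syntax

# Keyword names are part of the call syntax, so renaming breaks callers.
accumulate_v1!(x; cache)  = (cache .+= x; cache)
accumulate_v2!(x; cache!) = (cache! .+= x; cache!)

c = [0.0, 0.0]
accumulate_v1!([1.0, 1.0]; cache = c)

old_call_broke = try
    accumulate_v2!([1.0, 1.0]; cache = c)   # old keyword name no longer exists
    false
catch
    true
end

accumulate_v2!([1.0, 1.0]; cache! = c)      # callers must switch to the new name
```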


#17

@kristoffer.carlsson Sorry, sloppy wording on my side (and the documentation in some places may not be entirely clear). I believe I have a problem with “mutable” (English is not my mother tongue).

function f!(x)
    x[1] = 1
end

x = [2,3];
f!(x);
print(x)

You say that x is mutable, but in your example going in and out of f does not change it (it does not “mutate”). In my example going in and out of f! changes it (it “mutates”). In my case the background is not even C/C++ but Fortran (not even the latest incarnations). Imagine how confused I sometimes am (memory allocation especially is still quite puzzling to me)…


#18

@Pier I’m not sure if your post was meant as a question, but here is my understanding of Julia’s argument passing model: According to the docs

Julia function arguments follow a convention sometimes called “pass-by-sharing”, which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values.

This means that in the code

function f!(foo)
    foo[1] = 1
end

bar = [2,3];
f!(bar);
print(bar)

bar gets passed by sharing to f!. This means that now, foo and bar are two names that are bound to the same array in memory. Setting foo[1] will therefore set the first element of the very array that bar is bound to. Therefore, print(bar) will print [1,3].

However, in the following code

function f(foo)
    foo = [1, 2]
end

bar = [2,3];
f(bar);
print(bar)

we are binding foo to a completely new array. You can think of this as the foo pointer being moved to another place in memory. This is because we do not access the value of foo, but reassign foo itself.
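The difference between the two cases can be checked with `===`, which tests whether two names refer to the very same object. This is a minimal sketch; `mutate!` and `rebind` are illustrative names for the two functions above, changed only to return `foo` so the result can be compared.

```julia
function mutate!(foo)
    foo[1] = 1        # writes into the shared array
    return foo
end

function rebind(foo)
    foo = [1, 2]      # binds the local name foo to a new array
    return foo
end

bar = [2, 3]

same_object = mutate!(bar) === bar    # the returned array is bar itself
after_mutate = copy(bar)              # snapshot of bar after mutation

fresh = rebind(bar)
different_object = fresh !== bar      # a new array; bar is untouched
```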

In contrast, if this code passed arguments by reference, foo and bar would not merely be bound to the same object; they would be bound to the same internal pointer, which in turn points to the object. In such a case, reassigning foo would simply rebind that internal pointer to a new memory location, which is a modification that then also applies to bar.
Does that make any sense?
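Julia does not pass by reference, but the behaviour described in the last paragraph can be approximated with an explicit `Ref` container, which acts as the shared “internal pointer”. This is only an emulation under that assumption; `rebind_through_ref!` is a hypothetical name.

```julia
function rebind_through_ref!(foo::Ref)
    foo[] = [1, 2]    # replaces what the shared container points to
    return foo
end

bar = Ref([2, 3])     # bar holds a container, not the array directly
rebind_through_ref!(bar)
contents = bar[]      # the caller now sees the new array
```

Here the reassignment inside the function is visible to the caller, because both sides share the container rather than just the array.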


#19

It makes perfect sense. The behaviour of f! and f is clear to me, and mine was not a question but an observation. And here we come to the initial point of the post. If we just look at f(foo) (imagine a long intricate function calling other functions) we cannot know if the caller will see [1, 3] (x is changed in the caller) or [2, 3] (x is unchanged in the caller). I think that the original intent of the post was to pose a question like “is there a way, or can there be a way, to make the expected result more evident/explicit/predictable by just looking at the function arguments?”


#20

As far as I can tell this is already allowed:

julia> function f2(a!) a[1] = 2; end
f2 (generic function with 1 method)

julia> a=[5]
1-element Array{Int64,1}:
 5

julia> f2(a)
2

julia> a
1-element Array{Int64,1}:
 2

I agree, decorating the input argument that will be modified inside the function makes things quite clear, in my opinion.

Note added later: This is weird: note that I am referring to a, not a!, but there is no error. [Note added even later: Oops! No, it was a different variable that was being modified inside the function.]
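For comparison, a sketch of the version in which the body uses the decorated name consistently, so that no global variable is involved. `f2!` and `b` are illustrative names.

```julia
function f2!(a!)
    a![1] = 2      # refers to the parameter a!, not to a global a
    return a!
end

b = [5]
f2!(b)             # b is modified through the parameter
```

With this definition the function works for any argument name at the call site and does not depend on a global `a` existing.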