Surprising (?) behaviour of assignment-update operators

It seems that f -= a is exactly equivalent to f = f - a. This surprised me because I’d naively expected -= to update in place, so that I wouldn’t need a view when f is an Array slice, but it seems I actually do:

julia> f = rand(3, 241, 2);

julia> a = rand(241);

julia> function update!(f, a, i, k)
       @. f[i,:,k] -= a
       end
update! (generic function with 1 method)

julia> function update_view!(f, a, i, k)
       @. @view(f[i,:,k]) -= a
       end
update_view! (generic function with 1 method)

julia> function update_view2!(f, a, i, k)
       @. f[i,:,k] = @view(f[i,:,k]) - a
       end
update_view2! (generic function with 1 method)

julia> @time update!(f, a, 1, 1);
  0.235987 seconds (440.13 k allocations: 25.313 MiB, 8.33% gc time, 99.98% compilation time)

julia> @time update!(f, a, 1, 1);
  0.000007 seconds (2 allocations: 2.047 KiB)

julia> @time update_view!(f, a, 1, 1);
  0.162541 seconds (326.76 k allocations: 18.569 MiB, 99.98% compilation time)

julia> @time update_view!(f, a, 1, 1);
  0.000006 seconds (1 allocation: 64 bytes)

julia> @time update_view2!(f, a, 1, 1);
  0.067399 seconds (27.27 k allocations: 1.185 MiB, 99.96% compilation time)

julia> @time update_view2!(f, a, 1, 1);
  0.000006 seconds (1 allocation: 64 bytes)

So update! requires allocations (I guess it compiles to something equivalent to update_view2!, but without the @view on the right hand side?), while update_view! and update_view2! are essentially identical.

In hindsight this makes some sense, but I find it surprising that I need @view on the left-hand side of an ‘assignment’ operator. So two questions: Is this expected behaviour? Is it worth giving a warning (edit: I mean a warning in the documentation of -=, etc. somewhere) about this behaviour of -= (I assume +=, etc. behave similarly)? Although maybe there is a warning somewhere and I just didn’t look hard enough (apologies if so).

Yes - f -= a expanding to f = f - a “just” reflects the reality that the data has to be loaded before the result can be assigned back to the variable. This is also documented.
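A minimal sketch of that rebinding (a hypothetical REPL session, output shown for a recent Julia version; the variable names are just for illustration):

julia> v = [1, 2, 3];

julia> w = v;           # w and v refer to the same array

julia> v -= [1, 1, 1];  # rebinds v to a freshly allocated result

julia> w                # the original array is untouched
3-element Vector{Int64}:
 1
 2
 3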

This is also necessary, since - (like all operators in Julia) is not special. It’s a regular function and can be overloaded and specialized for whatever types f and a may have (it can even return an object of a different type from both f and a).

@. f -= a not directly writing to f may be surprising at first, but consider that it may not be possible to read & write from/to f at the same time. For example, what if f .- a returns an object of a different type than f? Writing to that wouldn’t necessarily make sense in the first place, but having it be equivalent to f .= f .- a implies that it should work anyway.
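For example (a hypothetical session, output shown for a recent Julia version; the point is only that the result type can change):

julia> f = [1, 2, 3];           # Vector{Int64}

julia> f -= [0.5, 0.5, 0.5];    # expands to f = f - [0.5, 0.5, 0.5]

julia> typeof(f)                # f is now bound to a Float64 array
Vector{Float64} (alias for Array{Float64, 1})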


In general, if you’re trying to write vectorized functions, you’re probably carrying over habits from a background in R or Python, where that is the default way of writing code (since there the fast kernels are written in C). This is not necessary in Julia (we have fast loops by default, after all) - on the contrary, it’s common to write scalar kernel functions and broadcast those directly (or to write a looped kernel using Tullio.jl or LoopVectorization.jl, if more than scalar access is needed).
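For instance, a loop-based version of update! might look like this (just a sketch; update_loop! is a hypothetical name, not code from this thread):

julia> function update_loop!(f, a, i, k)
           @inbounds for j in axes(f, 2)
               f[i, j, k] -= a[j]   # scalar update, truly in place, no allocations
           end
           return nothing
       end
update_loop! (generic function with 1 method)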

I don’t think a warning should be printed here - expecting it to be in-place (or atomic?) by default seems weird to me, since that would mean having to implement -= separately from - (to indicate the in-place nature) or requiring - to always write to one of its arguments, which seems non-ideal.


@Sukera - thanks for the reply. I agree with what you say. It certainly makes sense for -= to share its implementation with -. And I’ll probably end up rewriting the kernels as explicit loops at some point. Array broadcasting can (in some situations) make for cleaner, more readable code. My main point was that I think this looks odd

@view(f[i,:,k]) -= a

just because I don’t expect to see @view on the left (but as you say, it makes sense because -= has to expand out into using -).

Edit: PS sorry, I only meant warning in the docs somewhere, not for the code to emit a warning - I’ve edited the original post to clarify.

Wait, what? Surely it does work in-place. This is just erroneous benchmarking.

@johnomotani the @time macro is not suitable for microbenchmarks. Try the BenchmarkTools package and remember to interpolate the variables.
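Something like this (a sketch; no output shown, since timings and allocation counts depend on your setup):

julia> using BenchmarkTools

julia> @btime update!($f, $a, 1, 1);         # note the $-interpolation of the globals

julia> @btime update_view2!($f, $a, 1, 1);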

Slices on the left-hand-side are always views, and you do not need the @view macro.

It does in this case because the type of f opted into broadcasted assignment (via broadcast!), but I don’t think this is the case in general.

The allocation comes from the fact that @. f[i,:,k] -= a expands into f[i, :, k] .-= a which is equivalent to f[i,:,k] .= f[i,:,k] .- a, which slices on the right hand side (and thus allocates).
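You can see the first step of that expansion directly (the exact printed form may vary between Julia versions):

julia> @macroexpand @. f[i, :, k] -= a
:(f[i, :, k] .-= a)

Lowering then rewrites the .-= as the .= form above, where the right-hand f[i, :, k] is an ordinary slice and therefore allocates a copy.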


I’m walking around on my phone, so I don’t get to test stuff. :grimacing: The slicing on the right-hand side is the problem, then.

f[j, :, k] .= view(f, j, :, k) .- a

should do it then.

But, still, don’t use the @time macro for this.

@DNF

But, still, don’t use the @time macro for this.

Why not? I get that the timing won’t be accurate enough to be useful, but here it’s only the number/size of allocations that we’re interested in. Aren’t the allocations deterministic, so @time and @benchmark would always give the same result (as long as the function is compiled already)?

Edit: No, @DNF is right. @time counts a few extra allocations (I guess because it doesn’t allow interpolating the global variables?).

Yes. Accessing globals may allocate a Ref, which can be avoided when using BenchmarkTools, but not with @time.