Surprising capture boxing behavior in closure

NHDaly · January 29, 2019, 10:27pm

Hi, I was surprised by the boxing behavior of closure capture, and I’d love to hear about its motivation and discuss whether it’s still relevant.

I came across this when I noticed how it can introduce a race condition in @async calls. I opened an issue here to discuss that:

github.com/JuliaLang/julia

Boxed variables in closures passed to a Task can cause race condition

opened 10:21PM - 29 Jan 19 UTC

closed 06:18PM - 18 Dec 19 UTC

NHDaly

The way we spawn `Task`s via `@async` can lead to a surprising race condition b/c variables inside the closure are sometimes boxed. This can make it difficult to spawn a `Task` that needs to capture a local variable.

Consider this simple example:

```julia
function countup()
    i = 0
    while i < 5
        @async println(i)
        i += 1
    end
end
```

```julia
julia> countup()
5
5
5
5
5
```

And the problem is nothing to do with the `@async` macro itself; you'd see the same problem using the Task constructor, `Task(()->println(i))`, since in both cases the behavior is related to boxing inside the closure. And since we require you to pass a zero-argument function, capturing variables with closures _seems_ like the solution you're supposed to use.

I just checked, and `Go`'s closures behave the same way. However, in `Go` they skirt this problem because the syntax to launch a goroutine is _required to be a function call_, so you have the opportunity to pass arguments to your constructed coroutine. In go, the above example works correctly:

```go
func main() {
    i := 0
    for ; i < 5; {
        // Since `go` statements require _you_ to call the function, you don't need to rely on capture.
        // This is a _call_ to the existing Println function.
        go fmt.Println(i)
        i += 1
    }
    time.Sleep(time.Second)
}
// prints: 0 1 2 3 4
```

And even if you _do_ need to construct a closure in go, you can still pass the pass-by-copy parameters directly:

```go
go func(i int) { fmt.Println(i); }(i)
```

But currently we don't have a way to start a Task with a parameter. The [current Task constructor](https://docs.julialang.org/en/v1.1.0/base/parallel/#Core.Task) requires its argument to be "callable with no arguments".

Currently the only way I can think of to mitigate this in my code is to spawn the Task with `@eval`, since it gives you the chance to splice in the _value_ of `x`, thus forcing the pass-by-copy syntax. But that's obviously terrible for a bunch of other reasons. So I think we need some kind of alternative solution here.
What do you think? Here are a few proposals I can think of:

1. For `Task(f)`, allow passing a function that takes arguments.
   - Then, when you schedule it, calling `schedule(t, x,y,z)` would allow you to pass the arguments to `f`, spawning `f(x,y,z)` in a coroutine.
   - The problem with this is that `schedule(t, val)` already has a definition: https://docs.julialang.org/en/v1.1.0/base/parallel/#Base.schedule (I don't really understand how to use that, though, and the docs say use of `yieldto` "is discouraged.")
2. We could consider allowing `@async` to use the `$` interpolation syntax within its expression body to indicate pass-by-value vs pass-by-reference semantics. This would be kind of like what `@btime` does. In my example, that would become:
   ```julia
   @async println($i)
   ```
   ~Sadly this might be breaking, though, for any expressions that already contained quoting and interpolation inside the `@async` body. :frowning: (e.g. `@async println(:($x + 5))`)~
   EDIT: I no longer think this would be breaking in any cases... In the provided example, the `$` is clearly interpolating into the quote, not into `@async`, following the same rules as in `@eval`.
3. We could introduce some kind of explicit capture syntax for closures that includes whether to pass-by-copy or -by-reference, maybe like that proposed here: https://github.com/JuliaLang/julia/issues/14959#issue-131893146 But i tend to agree that it complicates things in an undesirable way.

Anyone have any other ideas?
EDIT: Also, I opened this discourse thread to get more background info on why the boxing is needed 🙂: https://discourse.julialang.org/t/surprising-capture-boxing-behavior-in-closure/20254

----------------------------

If you're interested, I came across this behavior when translating this cool prime number sieve written in Go into Julia: http://tinyurl.com/gosieve

Here's the julia code I wrote:

```julia
"""
    ConcurrentPrimeSieve
Based on http://tinyurl.com/gosieve
"""
module ConcurrentPrimeSieve

# Send the sequence 2, 3, 4, ... to channel 'ch'.
function Generate(ch::Channel{Int})
    for i in Iterators.countfrom(2)
        put!(ch, i)  # Send 'i' to channel 'ch'.
    end
end

# Copy the values from channel 'in' to channel 'out',
# removing those divisible by 'prime'.
function Filter(in::Channel{Int}, out::Channel{Int}, prime::Int)
    while true
        i = take!(in)  # Receive value from 'in'.
        if i%prime != 0
            put!(out, i)  # Send 'i' to 'out'.
        end
    end
end

# The prime sieve: Daisy-chain Filter processes.
function main()
    ch = Channel{Int}(0)  # Create a new channel.
    @async Generate(ch)   # Launch Generate goroutine.
    for i in 1:10
        prime = take!(ch)
        println(prime)
        # THIS IS BROKEN (race condition):
        # The `ch` binding inside this async expression _changes_ before it executes!
        ch1 = Channel{Int}(0)
        @async Filter(ch, ch1, prime)
        ch = ch1
    end
end

end
```

I didn’t know about boxing during closure capture before encountering this code.

Here is a simple example. I would’ve expected this to print 0, but it prints 1:

julia> function closure_surprise()
           x = 0
           # I would've expected the closure to be "constructed" here, with the value in `x`
           # captured by copy at _this point_.
           f = ()->println("INNER: $x")
       
           x = 1
           f()
       end
closure_surprise (generic function with 1 method)

julia> closure_surprise()
INNER: 1

Against my expectations, when the outer-scope modifies x, it also changes the value of x seen inside the closure.

Let me expand a bit on why this wasn’t what I expected:

The docs imply x should be a copy:
Julia Functions · The Julia Language
```
function adder(x)
    return y->x+y
end
```
is lowered to (roughly):
```
struct ##1{T}
    x::T
end

(_::##1)(y) = _.x + y

function adder(x)
    return ##1(x)
end
```
If that’s really how it was implemented, _.x would simply be an Int, whose value would be set to whatever x was at construction time.
You can return an anonymous function that captures local variables without wondering what happens when the local variable goes out of scope. And that’s because it’s copied into the closure, not passed as a reference to a local stack-allocated variable.
This doesn’t match up with the behavior in the rest of Julia for immutable types like Int: a variable name is supposed to just be a name that refers to a value. Assigning x=0; y=x doesn’t bind y to x; instead they both point at x’s current value, and changing x doesn’t change y.
Even an actual reference (Ref) to an integer variable just becomes a reference to a copy of the variable!
```
julia> begin
         x = 0
         xref = Ref(x)
         x = 1
         xref[]
       end
0
```
In C++, a lambda that names a variable in its capture list (`[x]`) captures it by copy unless you explicitly ask for a reference (`[&x]`), which might be part of why I expected that here.

That said, I now understand that this is following the rules referenced here in the Performance Tips:

This style of code presents performance challenges for the language. The parser, when translating it into lower-level instructions, substantially reorganizes the above code by extracting the inner function to a separate code block. “Captured” variables such as r that are shared by inner functions and their enclosing scope are also extracted into a heap-allocated “box” accessible to both inner and outer functions because the language specifies that r in the inner scope must be identical to r in the outer scope even after the outer scope (or another inner function) modifies r.

And indeed, we can see that is what is happening:

julia> @code_lowered closure_surprise()
CodeInfo(
2 1 ─      x = (Core.Box)()
  │        (Core.setfield!)(x, :contents, 0)
5 │        #76 = %new(Main.:(##76#77), x)
  │        f = #76
7 │        (Core.setfield!)(x, :contents, 1)
8 │   %6 = (f)()
  └──      return %6
)

And we can also see that if we delete the x=1 at the end of the function, x stops being Boxed:

julia> @code_lowered closure_surprise()
CodeInfo(
80 1 ─      x = 0
83 │   %2 = Main.:(##84#85)
   │   %3 = (Core.typeof)(x)
   │   %4 = (Core.apply_type)(%2, %3)
   │        #84 = %new(%4, x)
   │        f = #84
86 │   %7 = (f)()
   └──      return %7
)
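A quicker way to check this, without reading lowered code, is to look at the field types of the closure object itself. This is only an illustrative sketch: the names `reassigned` and `not_reassigned` are made up for this example, and it relies on the current implementation detail that a closure stores each captured variable in a field of the same name.

```julia
# Hypothetical example functions; only the reassignment of `x` differs.
function reassigned()
    x = 0
    f = () -> x
    x = 1          # `x` is assigned again after being captured
    return f
end

function not_reassigned()
    x = 0
    f = () -> x    # `x` is assigned exactly once, before the capture
    return f
end

fieldtype(typeof(reassigned()), :x)      # Core.Box
fieldtype(typeof(not_reassigned()), :x)  # Int64 (i.e. Int)
reassigned()()                           # 1, the closure sees the later assignment
```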

The problem I have here is that I can’t figure out the right way to get the behavior I wanted through a closure. Inside the function, I can’t mark it local or anything, and it’s too late to make a copy.

Okay, so all that said, I’m sure there are interesting reasons that I don’t yet understand for why this is done. Are there situations where the boxing is necessary? I’d love to hear those if anyone can spare the time! Thanks!

ffevotte · January 30, 2019, 7:17am

On the other hand, the docs on scoping and let use closure creation as the example that illustrates how let keeps two closures from sharing the same variable.

Wouldn’t using let be the right way to get what you want here too?

function closure_surprise()
    x = 0

    f = let x = x
        ()->println("INNER: $x")
    end
    
    x = 1
    f()
end

julia> closure_surprise()
INNER: 0
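The same `let` trick should carry over to the `@async` loop from the first post. Below is a minimal sketch under that assumption; `countup_let` and `results` are hypothetical names, and the values are collected and sorted rather than printed so that the outcome does not depend on the order in which the tasks happen to run.

```julia
function countup_let()
    results = Int[]
    tasks = Task[]
    i = 0
    while i < 5
        # `let i = i` creates a fresh binding that is never reassigned,
        # so the task's closure sees the value `i` had at this iteration.
        t = let i = i
            @async push!(results, i)
        end
        push!(tasks, t)
        i += 1
    end
    foreach(wait, tasks)   # make sure every task has run
    return sort(results)
end

countup_let()  # [0, 1, 2, 3, 4]
```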

c42f · January 31, 2019, 6:21am

A while ago there was some discussion that “freezing the captures” might make sense, in the issue “do block support?” (Issue #5, c42f/FastClosures.jl on GitHub). I didn’t get around to implementing that, but most of the machinery is there. It also probably makes more sense than what I currently have in that package.

I’ve wondered about the exact reason for this as well. It seems to have been there since the very first closure lowering, in the commit “implementing closure conversion, the next lowering pass” (JuliaLang/julia@6ed5f4e on GitHub). It certainly makes closures with mutable state easy to construct, but that seems a little at odds with modern Julia preferring immutable structs, etc. It would be quite easy to explicitly use a Ref for occasions which require mutable state. @jeff.bezanson are there points of interest that we’re missing here?
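For concreteness, here is the kind of shared mutable state that the current rule makes easy. It is only an illustrative sketch, and `sum_with_closure` is a made-up name. The `do` block is a closure that assigns to `total`, so `total` has to be shared with the enclosing function rather than copied.

```julia
function sum_with_closure(xs)
    total = 0
    foreach(xs) do x
        total += x   # assigns to the enclosing function's `total`
    end
    return total     # the outer scope sees the closure's updates
end

sum_with_closure([1, 2, 3])  # 6
```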

kristoffer.carlsson · January 31, 2019, 8:58am

Always making passed-in values const and requiring Ref for mutation would be my preference too (see “Explicit capture of variables in a closure”, Issue #14959, JuliaLang/julia on GitHub). That could perhaps avoid having to Box?
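As a rough sketch of what the explicit-Ref style looks like in today’s Julia (the names `make_counter`, `counter`, `increment` and `current` are invented for this example): the variable is bound once to a `Ref`, and only the contents of the `Ref` are mutated, so the closures capture a concretely typed field.

```julia
function make_counter()
    counter = Ref(0)                        # bound once, never reassigned
    increment = () -> (counter[] += 1)      # mutates the contents, not the binding
    current = () -> counter[]
    return increment, current
end

increment, current = make_counter()
increment(); increment()
current()                                   # 2
fieldtype(typeof(increment), :counter)      # Base.RefValue{Int64} on a 64-bit machine
```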

c42f · February 18, 2019, 4:15am

I tend to agree with you. Coming from C++, I find the Scheme-like lexical scoping rules quite counterintuitive, perhaps because of a dissonance between imperative and functional styles. However, any alternative would seem to need a story about how mutually recursive inner functions would refer to each other.
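To make the mutual-recursion concern concrete, here is a small sketch (the names `parity_checker`, `is_even` and `is_odd` are made up). `is_even` has to refer to `is_odd` before `is_odd` has been defined, so a scheme that copied values at closure-construction time would have nothing to copy yet.

```julia
function parity_checker()
    # `is_even` mentions `is_odd` before `is_odd` exists, so the two inner
    # functions need to share bindings rather than snapshot values.
    is_even(n) = n == 0 ? true : is_odd(n - 1)
    is_odd(n) = n == 0 ? false : is_even(n - 1)
    return is_even
end

is_even = parity_checker()
is_even(10)  # true
is_even(7)   # false
```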

It’s interesting to re-read the discussion at https://github.com/JuliaLang/julia/issues/16727.

We can at least have a macro which introduces independent bindings. FastClosures doesn’t actually do this (mainly because I didn’t fully understand the rules when I wrote it). But that could be fixed; see https://github.com/c42f/FastClosures.jl/issues/11.

thautwarm · September 30, 2019, 4:14pm

Ref is type stable, but sometimes people need a dynamically typed free variable whose type can change. So replacing Core.Box with Ref everywhere might not be a good idea.

However, when we can be sure the free variable has a stable type, the compiler should use a Ref; that is not implemented in Julia.

Tamas_Papp · September 30, 2019, 5:48pm

I think that

julia> C = Ref{Any}(0)
Base.RefValue{Any}(0)

julia> C[] = "something"
"something"

is an option, too.

NHDaly · July 24, 2020, 3:59pm

I just want to follow up and note that a workaround for this behavior during concurrent programming was introduced in v1.4, here:
https://github.com/JuliaLang/julia/pull/33119

As of 1.4, you can use $ to interpolate an argument into a @spawn or @async call, which will prevent this surprise boxing. For example, this code is protected against the surprise shared reference, and will work like we intended:

function countup()
    i = 0
    while i < 5
        @async println($i)
        i += 1
    end
end

julia> countup()
0
2
1
3
4

It only works for @async and @spawn, not for arbitrary lambdas, but I think it’s helpful, and you can follow the same pattern using let for arbitrary lambdas.
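Here is a minimal sketch of that `let` pattern for plain lambdas; `make_getters` is a hypothetical name used only for illustration. Without the `let`, all three closures would share the single boxed `i` and each would return 3.

```julia
function make_getters()
    getters = Function[]
    i = 0
    while i < 3
        let i = i                     # fresh binding for this iteration
            push!(getters, () -> i)   # captures the fresh binding
        end
        i += 1
    end
    return getters
end

[g() for g in make_getters()]  # [0, 1, 2]
```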

NHDaly · July 28, 2020, 7:27pm

We also talked about the performance implications of this again today at JuliaCon, and we wondered whether there aren’t cases where the compiler can at least turn x into a Ref{Int} instead of a Box.

Is there an existing discussion about efforts to do this kind of analysis in the compiler?

CC: @oxinabox, @Syx_Pek, @Oscar_Smith, @oschulz

For example, in this case we’d have expected the compiler to treat x as a Ref{Int}:

julia> function closure_surprise()
           x::Int = 0
           # I would've expected the closure to be "constructed" here, with the value in `x`
           # captured by copy at _this point_.
           f = ()->println("INNER: $x")

           x = 1
           return f
       end
closure_surprise (generic function with 1 method)

julia> f = closure_surprise()
#4 (generic function with 1 method)

julia> dump(typeof(f))
var"#4#5" <: Function
  x::Core.Box

julia> @code_warntype closure_surprise()
Variables
  #self#::Core.Const(closure_surprise, false)
  #4::var"#4#5"
  f::var"#4#5"
  x::Core.Box

Body::var"#4#5"
  ...

c42f · August 4, 2020, 3:51am

The issue is that closure conversion is done in lowering which happens well before any type inference. So looking at @code_warntype gives a false sense of what the compiler knows when it’s generating the closure struct.

In the current compiler I imagine it may be possible to support this particular case and generate a Ref rather than a box because the typeassert is purely syntactic (and therefore available to the code in lowering).

But in general it’s hard to attack the boxing problem systematically without moving some part of closure conversion out of lowering and making it happen during type inference/optimization. For example, https://github.com/JuliaLang/julia/pull/31253
