Help understand boxing

I am trying to resolve a type stability problem in a function where some local variables show up as type Core.Box when I run @code_warntype. I’ve narrowed the cause to where I use a comprehension or generator with those variables as inputs, and those variables can be modified later in the function.

Here’s an MRE. In the second function, nobox(), with the line x = 1.0 commented out, x is correctly reported as Float64.

using Statistics  # provides mean

function box()

    x::Float64 = rand()
    y = 0.0

    if x > 0
        y = mean((round(x) for _ in 1:100))
    end

    x = 1.0

    return(x, y)

end

function nobox()

    x::Float64 = rand()
    y = 0.0

    if x > 0
        y = mean((round(x) for _ in 1:100))
    end

    #x = 1.0

    return(x, y)

end

As a workaround, if I move the equivalent in my code of the line y = mean((round(x) for _ in 1:100)) to its own function, it also works as expected and runs several times faster.
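A minimal sketch of that workaround, assuming a helper named mean_rounded (a hypothetical name for illustration, not taken from the real code). Because x is an argument of the helper and is never reassigned there, the generator's closure should have no reason to box it:

```julia
using Statistics

# Hypothetical helper acting as a function barrier: `x` is a plain argument
# here and is assigned only once, so the generator can capture it directly.
mean_rounded(x) = mean(round(x) for _ in 1:100)

function barrier()
    x::Float64 = rand()
    y = 0.0

    if x > 0
        y = mean_rounded(x)   # the generator now lives in the helper
    end

    x = 1.0                   # reassigning `x` here no longer involves a closure

    return (x, y)
end
```

Running @code_warntype barrier() is the way to check whether x still shows up as Core.Box on a given Julia version.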

I usually try to break up my code in a way that is pretty natural to function barriers, but this originally seemed like too trivial a step to separate out. I’d appreciate any insight into why this is happening and if there’s a proper way to do this.


This is so confusing for me too:

julia> function box()
           x::Float64 = rand()
           y = 0.0
           if x > 0
               y2 = mean((round(x) for _ in 1:100))
           end
           x = 1.0
           return(x, 1)
       
       end
box (generic function with 1 method)

julia> function no_box()
           x::Float64 = rand()
           y = 0.0
           #if x > 0
           #    y2 = mean((round(x) for _ in 1:100))
           #end
           x = 1.0
           return(x, 1)
       
       end
no_box (generic function with 1 method)

I believe this is the famous “variable captured in a closure” bug.


Performance Tips · The Julia Language has some more information about this particular problem.

Yeah but in this case the type annotation does not help.

I’m aware of the captured variable bug, but it’s kinda annoying that it keeps recurring. Here it seems that the comprehension captures x?

What helps is a let block:

function box_fixed()
    x::Float64 = rand()
    y = 0.0

    let x = x
        if x > 0
            y2 = mean((round(x) for _ in 1:100))
        end
    end
    x = 1.0
    return(x, 1)
end

I have often encountered cases where this carries a huge performance penalty, and as a user this is really worrisome.

Correct, Meta.@lower is your friend here:

julia> Meta.@lower mean(x for _ in 1:100)
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─       $(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─      global var"#5#6"
│        const var"#5#6"
│   %3 = Core._structtype(Main, Symbol("#5#6"), Core.svec(), Core.svec(), Core.svec(), false, 0)
│        Core._setsuper!(%3, Core.Function)
│        var"#5#6" = %3
│        Core._typebody!(%3, Core.svec())
└──      return nothing
)))
│   %2  = Core.svec(var"#5#6", Core.Any)
│   %3  = Core.svec()
│   %4  = Core.svec(%2, %3, $(QuoteNode(:(#= none:0 =#))))
│         $(Expr(:method, false, :(%4), CodeInfo(
1 ─     return x
)))
│         #5 = %new(var"#5#6")
│   %7  = #5
│   %8  = 1:100
│   %9  = Base.Generator(%7, %8)
│   %10 = mean(%9)
└──       return %10
))))

IMO, this is one of the most surprising and “leakiest” parts of lowering. There is no indication that a closure is created when you write a comprehension, so people don’t know to look for one when trying to hunt down the source of the box. Ideally, there would be a way to stabilize captures in comprehensions which doesn’t require solving the entirety of the closure capture problem.
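As a small sketch of how to see the hidden closure from the outside: the function make_gen below is made up for illustration, and the f field of Base.Generator plus the closure's field names are implementation details rather than a stable API.

```julia
# Hypothetical function that returns a generator using a local variable.
function make_gen()
    x = 2.4
    return (round(x) for _ in 1:3)
end

g = make_gen()

println(typeof(g))                # a Base.Generator wrapping a closure type
println(g.f isa Function)         # the generator body is stored as a function
println(fieldnames(typeof(g.f)))  # captured variables appear as fields of the closure
```

Here x is assigned only once, so the closure can hold the value directly; the box only shows up once x is also reassigned somewhere in the enclosing function.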


There is no indication that a closure is created when you write a comprehension, so people don’t know to look for one when trying to hunt down the source of the box.

This is certainly part of my confusion. I was trying to comprehend (no pun intended) how the documentation linked by @Zentrik related to this since I did not think I was creating or returning a sub-function but the evaluated result.

Well, that certainly feels awkward and counterintuitive. I’ll stick to separating out the function. It’s a shame because comprehensions, generators, map, etc. are all such workhorses.


I do wonder whether we could fix this by lowering to opaque closures. This is a case where we make the closure and immediately execute it, so making a closure does lose information.

Unless, of course, you keep the generator around:

function box()
    x::Float64 = rand()
    if x > 0
        y = (round(x) for _ in 1:10)
    end
    x = 1.0
    return y
end

collect(box())

returns a vector of 1.0s.
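A sketch of why that happens, using a made-up variant called keep_gen. It peeks at the closure stored in the generator, which relies on current lowering internals (the Core.Box field), so treat it as an illustration rather than something to depend on:

```julia
# Hypothetical variant: the generator escapes the function, and `x` is
# reassigned after the generator has been created.
function keep_gen()
    x::Float64 = rand()
    g = (round(x) for _ in 1:3)
    x = 1.0
    return g
end

g = keep_gen()

# The closure holds a shared box for the variable `x`, not a copy of its
# value, so the later assignment `x = 1.0` is visible to the generator.
println(typeof(g.f.x))
println(g.f.x.contents)
println(collect(g))
```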


The scoping rules say that comprehensions make new local scopes, which use outer local variables; they are built on methods so that requires capturing. Expr(:method... showed up in the lowered code to make the closure. Closure behavior and the performance pitfalls are plainly documented in the sections on comprehensions and generator expressions in Single- and multi-dimensional Arrays · The Julia Language.

If the value of x at a particular point is preferable to capturing the variable x in general, then this is semantically the proper way.
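To make the semantic difference concrete, here is a sketch with a made-up function keep_gen_let, where the generator outlives the function. The let block introduces a fresh binding that is assigned once, so the generator sees the value x had at that point and not the later one:

```julia
# Hypothetical example: capture the value of `x` at a particular point.
function keep_gen_let()
    x::Float64 = 0.25
    g = let x = x                  # new local `x`, assigned exactly once
        (round(x) for _ in 1:3)
    end
    x = 1.0                        # does not affect the `x` inside the let
    return g
end

g = keep_gen_let()
println(collect(g))                # built from round(0.25), not round(1.0)
println(typeof(g.f.x))             # the captured field holds a plain value
```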

I think this is a fair description because there are feasible ways to mitigate this, but “bug” doesn’t get across the difficulty of mixing apparently straightforward type inference with performant closure implementation.

For example, sgaure just pointed out that immediate execution for a particular value cannot be an automatic optimization if the generator persists. Unfortunately, even y = mean((round(x) for _ in 1:100)) does not guarantee that the generator is never reused because mean could cache the generator somewhere for all we know. I assume some compiler effect analysis can address that, but I’m not sure how reliable that is, and it’s moot because boxing is still determined at lowering, far before a call signature is available to trigger compilation. We’d prefer something that works unconditionally and safely.

My recommendation in this respect is not to reassign local variables or function arguments inside a function. It can introduce unexpected behavior in several instances, given the issue of variables captured in a closure.

Example:

function foo()
    x = [1.0, 2.0]
    x = [1, 2]

    [mean(x) for _ in 1:10]
end

@code_warntype foo()  #type unstable

Thus, the following is fine:

function nobox()
    x = rand()
    y = 0.0

    if x > 0
        y = mean(round(x) for _ in 1:100)
    end

    z = 1.0             # don't reassign `x`

    return z, y

end

My recommendation in this respect is not to reassign local variables or function arguments inside a function.

I appreciate that. The example I provided does not convey what I’m doing with the real code, where the final value of x is constrained by an intermediate evaluation of y(x). A little more specifically I am solving and returning a point <x, y(x), ... > where x and other dimensions must be adjusted to zero for certain ranges of y(x).

There may indeed be cleaner ways of writing this, and it does work fine to tuck y(x) into a separate function. What’s confusing, being new to Julia, is that y(x) introduces type instability for x at all, since x isn’t modified within the comprehension.


Note that the comprehension by itself is not introducing the type instability. It’s the combination of reassigning a variable AND capturing it in a closure.

For instance, the second function below becomes type unstable once you add a function definition inside it, even if you type annotate x.

function foo()
    x = 1
    x = 1
        
    return x
end

@code_warntype foo()            # type stable


function foo()
    x::Int64     = 1
    x            = 1
    bar()::Int64 = x::Int64
    
    return bar()
end

@code_warntype foo()            # type unstable

I know it’s quite confusing and the issue has been around for a long time. This hints that there’s no easy solution to the problem.

You can address the problem by either i) splitting a function into multiple separate functions, ii) not reassigning variables, or iii) passing all variables to the closure as arguments (even functions).

# example of i)
bar(x) = x

function foo()
    x     = 1
    x     = 1    
    
    return bar(x)
end

@code_warntype foo()            # type stable


# example of ii)
function foo()
    x      = 1
    z      = 1
    bar()  = z
    
    return bar()
end

@code_warntype foo()            # type stable


# example of iii)
function foo()
    x      = 1
    x      = 1
    bar(x) = x
    
    return bar(x)
end

@code_warntype foo()            # type stable


This link could help you.

It’s from a book I’m writing with the basics of Julia. Note that the book is unfinished in terms of content, writing, etc (the link is not even public yet). Hopefully, a preliminary version will be ready by mid/end of the year.



Thanks, I also found a way (see result here), but it is also very cumbersome.


Please note that (round(x) for _ in 1:100) is a generator with a closure, and [round(x) for _ in 1:100] is a comprehension which immediately executes its [implicit] generator.

Actually, the type-annotation does help. If you look at the lowered code, you will notice that x is typeasserted to Float64 every time it is accessed. This means that the code is type-stable, and dynamic dispatch does not occur when calling functions on x.

This can be confirmed by looking at @code_warntype box(): although accessing the box contents retrieves a value of type Any, it is immediately typeasserted to Float64 and all subsequent calls are type-stable.

As a result, the code is far more performant than if the type-annotation wasn’t there: box has the same performance as nobox, other than the allocation required for the Core.Box. Without the type annotation, it would be about 100x slower.
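A rough way to see the effect of the annotation without benchmarking is to compare inferred return types. The two functions below are hypothetical stand-ins (an explicit closure instead of a generator); both capture x and reassign it, so both should box it, and only the first declares a type. On Julia versions that lower such variables to Core.Box, I would expect the first to report Float64 and the second something wider such as Any, but take that as an expectation, not a guarantee:

```julia
# Hypothetical pair: same capture-and-reassign pattern, with and without
# a declared type for `x`.
function boxed_annotated()
    x::Float64 = rand()
    f = () -> x        # capturing closure
    x = 1.0            # reassignment after capture forces the box
    return x           # read from the box, then type-asserted to Float64
end

function boxed_plain()
    x = rand()
    f = () -> x
    x = 1.0
    return x           # read from the box with no type assertion
end

# Print the return type inferred for each zero-argument method.
println(only(Base.return_types(boxed_annotated, ())))
println(only(Base.return_types(boxed_plain, ())))
```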

In this case, this is not a bug; the creation of a box is semantically necessary since it’s impossible for lowering to know syntactically whether mean will store the generator or not.

We could make the Core.Box type-parameterized (similar to Ref), which allows additional compiler optimizations to work, but this has some challenges. I want to solve those challenges but life’s been in the way.

In your example, the insertion of a box is a “bug” (in the sense that x doesn’t actually need to be boxed to meet language semantics, but it’s boxed anyway because of how the capture logic is currently implemented).

It’s my opinion that it’s feasible to fix this bug, and I opened an issue for it, but it might be an intensive affair to edit lowering without breaking other things.


There’s an optimization for comprehensions that’s possible but currently is not implemented, since comprehensions use special syntax and are known to immediately execute their generator.

When we intend to capture a variable’s instantaneous value, the let block is indeed the proper way to go. FastClosures implements a macro which automates the creation of let blocks. Perhaps this could be edited to work also on generator expressions. If the macro was given a better name (currently it’s called @closure, which does not accurately reflect what it does), perhaps it could be part of Base.

If that is done, then lowering could emit a call to that macro whenever it encounters a comprehension.