Type stability in closures

I’m doing a bunch of code generation and came across a strange behaviour involving type stability, closures and variable captures. I reduced the problem to the following example:

function f(a, b)
    _f = (x, y) -> begin
        c = x * y
    end
    return _f(a, b)
end

is fully type stable, i.e. Base.return_types(f, (Float64, Float64)) returns Float64, as I would expect, and that’s good. However,

function f(a, b)
    _f = (x, y) -> begin
        c = x * y
    end
    c = _f(a, b)
    return c
end

is type-unstable, i.e., Base.return_types(f, (Float64, Float64)) returns Any. The problem seems to be that the variable name c is used twice, which causes Julia to capture the variable and box it. Renaming either of the two c's to something else makes the function type stable again.
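For anyone who wants to reproduce the comparison in one session, here is a minimal sketch. The names f_stable and f_boxed are made up; they are just renamed copies of the two definitions above so both can coexist. The Any result for the second one is what I see and may depend on the Julia version.

```julia
# Renamed copies of the two definitions so they can live side by side.
function f_stable(a, b)
    _f = (x, y) -> begin
        c = x * y
    end
    return _f(a, b)
end

function f_boxed(a, b)
    _f = (x, y) -> begin
        c = x * y   # same name as the outer `c` below, so it is captured
    end
    c = _f(a, b)
    return c
end

# Base.return_types returns a vector with one entry per matching method.
stable_types = Base.return_types(f_stable, (Float64, Float64))
boxed_types = Base.return_types(f_boxed, (Float64, Float64))

println(stable_types)
println(boxed_types)
```

Both functions still compute the same value; only the inferred return type differs.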

Is there any way to prevent Julia from trying to capture something that it shouldn’t? This would probably be my ideal solution (because of world-age problems, I can’t simply define the closures as functions in an outer scope either). I’ve tried playing around with let but didn’t get anything to work. This was the approach in some of the other topics I found that seemed related.

Since this problem comes up for me in code generation, I’d rather not have to do variable renaming, because that would increase the complexity of my code by a lot, just to work around something that honestly feels like a bug in the language.


Strictly speaking, no, because capturing a variable is no different from the routine existence of a local variable across nested local scopes, and type-unstable variables are allowed in dynamically typed languages. The issue is that it is a bug in this case, and captured and reassigned variables could at least be highlighted by a linter.

If you intend the 2 c to be separate variables, then you need to disambiguate them. If you don’t want different names, you could use local declarations, including implicit ones in let and for headers. Code generation could also involve method-wise gensym, though you’d have to be careful not to change symbols for intended captured variables. The only way to avoid the trouble of disambiguating names is to not write nested local scopes e.g. lifting the method into the global scope for function-like objects to hold data.
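A minimal sketch of the local declaration option, assuming the two c are meant to be separate variables. The name f_local is made up for illustration; the only change from the original function is the local keyword inside the closure.

```julia
function f_local(a, b)
    _f = (x, y) -> begin
        local c = x * y   # `local` declares a new `c` that belongs to the closure body
        c
    end
    c = _f(a, b)          # this `c` is now a separate variable and is never captured
    return c
end

# One entry per matching method; here there is exactly one.
local_types = Base.return_types(f_local, (Float64, Float64))
println(local_types)
```

Since the closure no longer captures anything, the outer c is an ordinary local that is assigned once.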

This might be worth making a post about. An unnecessary closure shouldn’t really be a solution to this, in fact it’s implemented as defining a globally scoped type and methods for function-like objects.

Do I understand correctly that you’re saying that the function/closure does not open a new scope and so even though c is never defined outside the closure, it is already declared (inside the closure), and so Julia cannot statically infer the type?
I’m still not sure how I would use a let block here at all, since I can’t use let c=c like in other examples, because c doesn’t exist before the closure. Can you modify my code example?
gensym is a good hint, I will look at this, so far I’ve generated my own unique symbols from random UUIDs.
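For reference, a minimal sketch of what the gensym route could look like in generated code. The generator make_f_expr and the name f_gen are hypothetical; the point is only that the closure's temporary gets a fresh symbol while the outer c keeps its name.

```julia
# Hypothetical generator: builds the expression for an anonymous outer function
# whose closure uses a fresh, collision-free name for its temporary.
function make_f_expr()
    tmp = gensym("c")   # a unique symbol that cannot clash with the outer `c`
    return quote
        function (a, b)
            _f = (x, y) -> begin
                $tmp = x * y
            end
            c = _f(a, b)
            return c
        end
    end
end

# Evaluated at top level, so calling f_gen afterwards is not a world-age problem.
f_gen = eval(make_f_expr())
gen_types = Base.return_types(f_gen, (Float64, Float64))
println(gen_types)
```

Symbols that are supposed to be captured on purpose would have to be left alone by such a generator.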

Lastly, I have not actually tried generating the closures as functions in the global scope. I assumed this would lead to world age problems (and it surely would, when evaling them one by one), but it might work if I bunch everything together and eval it together. I will report back once I’ve tested this.

In any case, thanks for the reply.

function f(a, b)
    _f = (x, y) -> begin
        let c = x * y
            c
        end
    end
    c = _f(a, b)
    return c
end

… or …

function f(a, b)
    _f = (x, y) -> begin
        let c
            c = x * y
        end
    end
    c = _f(a, b)
    return c
end

No, the method is indeed a new nested local scope. It however shares a local variable named c with the outer local scope.

The line c = _f(a, b) does define c outside the closure.

Julia compiles one method call at a time. You usually can’t infer a variable that is assigned across 2 separate calls when inferring one call at a time. f only calls _f once, effectively computing a * b, so it looks feasible here, but the compiler currently handles the general case, where _f could be called with a variety of other inputs that add to the types that c needs to accommodate. Currently the lowerer just boxes c when it notices that it is reassigned in the closure definition; even if you annotate the variable’s type so it doesn’t need to be inferred much, it’s still boxed, and there’s just extra code to convert and typeassert when writing to the box. While that and a few other things could be improved, it won’t ever come close to making the general case inferrable.
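One way to see the box directly, as a rough sketch: a closure is a struct with one field per captured variable, so you can look at the field types. The function make_closure is made up for illustration, and the exact representation is an implementation detail that may change between Julia versions.

```julia
function make_closure()
    _f = (x, y) -> begin
        c = x * y        # assigns the captured `c`
    end
    c = 0.0              # makes `c` a local of make_closure, shared with _f
    return _f
end

g = make_closure()

# Inspect the fields of the closure's generated type.
box_fields = fieldtypes(typeof(g))
println(fieldnames(typeof(g)))
println(box_fields)
```

On current versions I would expect the field for c to be a Core.Box rather than a Float64, which is why reads of c come back as Any.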

Closure methods are defined while the outer method is defined, so bunching methods together in 1 eval seems like a reasonable mimicry.


What confuses me is that this line comes after the closure, so I assumed it should be unable to affect type stability inside the closure… But I guess that’s not how Julia compiles things then.

bunching methods together in 1 eval seems like a reasonable mimicry.

Ok, so the top-level function I’m generating (f() in this case) I put into a RuntimeGeneratedFunction already, to be able to call it directly. And it looks like this doesn’t take multiple function definitions at once.

function foo()
    expr = Meta.parse("begin
    function _f(x, y)
        return c = x * y
    end")
    expr2 = Meta.parse("
    function f(a, b)
        c = _f(a, b)
        return c
    end")
    f = RuntimeGeneratedFunction(@__MODULE__, @__MODULE__, Expr(:block, expr, expr2))
    return f(5, 6)
end

This throws an ArgumentError.

Quick example of the type instability:

julia> function foo()
         bar(a, b) = (c = a*b)
         c = bar(1, 2) # assign here to declare existence in scope
         println(typeof(c))
         bar(1, 2.3)
         println(typeof(c))
         bar(1.0, 2im)
         println(typeof(c))
       end
foo (generic function with 1 method)

julia> foo()
Int64
Float64
ComplexF64

Again, it only seems feasible to infer c in your example because you only call _f once and it doesn’t escape to be called again. I don’t think Julia’s escape analysis knows that currently, nor is this one case enough of a reason to do that work.

I never used RuntimeGeneratedFunction so I can’t help you there, but I did spot that your first expr is an incomplete begin block.

I’m sorry if I’m slow, but I still don’t understand this. In your example, the code is type unstable, that’s true. I would expect the outside c to be type unstable. However, I would expect each call to bar to be compiled as a separate method, with its separate “instance” of c inside bar’s function body, and with a type stable inside c for each method call.
And if I change the definition of bar to assign to d instead of c, that’s exactly what happens. So I don’t see how the later and outer scope uses of c affect the body of bar at all.

I never used RuntimeGeneratedFunction so I can’t help you there, but I did spot that your first expr is an incomplete begin block.

You’re right, thank you. But the problem is simply that RuntimeGeneratedFunctions expects an expression starting with a function directly, so it doesn’t even get to that point. Using eval there would simply fail directly when calling f because it’s too new.
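A minimal sketch of the "one eval" idea with plain eval instead of RuntimeGeneratedFunctions, assuming Base.invokelatest is acceptable for the call (it sidesteps the "too new" error at the cost of a dynamic dispatch). The names _f_generated, f_generated and build_and_run are made up for illustration.

```julia
function build_and_run()
    code = quote
        # Lifted to global scope: the two `c` now live in unrelated functions.
        function _f_generated(x, y)
            c = x * y
            return c
        end
        function f_generated(a, b)
            c = _f_generated(a, b)
            return c
        end
    end
    # Both definitions are evaluated together; the value of the block is the
    # last function defined, so no global lookup is needed afterwards.
    f = Core.eval(@__MODULE__, code)
    # The methods are newer than the world build_and_run is running in,
    # so call them through invokelatest.
    return Base.invokelatest(f, 5, 6)
end

println(build_and_run())
```

Whether the invokelatest overhead matters depends on how often the generated function is called from the same dynamic scope that created it.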

The c inside bar is the exact same variable as the c inside foo. Type inference (or lack thereof) for c must be consistent across the foo() call and all possible bar calls.

But if the 1 c inside foo includes the c inside bar, then why is each bar call type stable if I replace the c by d? Wouldn’t there then be only 1 d inside foo, which is the one inside bar, which has to be consistent across all its call signatures?

d isn’t inside foo’s scope at all in that edit.

julia> function food()
         bar(a, b) = (d = a*b)
         println(d)
         c = bar(1, 2)
       end
food (generic function with 1 method)

julia> food()
ERROR: UndefVarError: `d` not defined in `Main`

Nested scopes can have local variables from outer local scopes, not vice versa. This is pretty common across languages.
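A small sketch of both directions of that rule. The names outer, add!, total and inner_only are made up for illustration.

```julia
function outer()
    total = 0
    function add!(x)
        inner_only = x   # new local of add!, invisible to outer
        total += x       # `total` is outer's local, captured and reassigned
    end
    add!(1)
    add!(2)
    # `inner_only` was never assigned in outer's scope, so it is not defined here.
    return (total, @isdefined(inner_only))
end

result = outer()
println(result)
```

The inner method sees and updates total, while inner_only never leaks out into outer.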

Ok yes, that makes sense. I will sleep on it. I think the ideal solution for my specific problem would be the approach of generating the functions in the global scope, but I’m not sure if I can do this without running into world age problems and without more or less reimplementing RuntimeGeneratedFunctions.
Otherwise I might just go with the renaming approach after all.

Thanks for the patience with your replies <3

This version actually fits my problem really well, and seems to work well! Thanks a lot.
