One of the most common complaints arising in this group is issue #15276 (“performance of captured variables in closures”, JuliaLang/julia on GitHub), which seems to cause all kinds of baffling performance hits. I’m wondering whether this issue could be solved by adding the following restriction to the language: any variable captured in a closure must have the same concrete type for its lifetime.
First, does this restriction solve the issue? In other words, if this restriction were added to the language, is there still a case when a variable whose type can be inferred in the outer scope would cause type instability in the closure?
Second, if this restriction were imposed, how much useful language functionality is lost?
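For readers who have not run into the issue, here is a minimal sketch of the kind of code it affects. The function names are made up for illustration. The only difference between the two functions is that x is assigned twice in the first one, which is enough for the closure to capture it through a Core.Box.

```julia
# x is assigned twice, so the closure captures it through a Core.Box.
function scaled_boxed(n)
    x = 1
    x = 2 * x
    return map(i -> x * i, 1:n)
end

# x is assigned once, so the closure captures a plain Int.
function scaled_plain(n)
    x = 2
    return map(i -> x * i, 1:n)
end

# Same values either way; only what inference can prove differs.
println(scaled_boxed(3) == scaled_plain(3))
println(Base.return_types(scaled_boxed, (Int,)))
println(Base.return_types(scaled_plain, (Int,)))
```

The two printed return types should differ: the second is concrete, while the first is typically something abstract because the contents of the box are read as Any.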
This issue is extremely annoying, but it’s not clear that solving it requires a language restriction. It has been milestoned to 1.x, so presumably that means it doesn’t require breaking changes to fix, though it would be great to have some sort of confirmation.
There are always workarounds for this bug (but they’re annoying and can be surprising). E.g., in the worst case, you could wrap the variable you’re using in the closure in a constant reference (or a constant vector of length Threads.nthreads()), or (probably a much better idea) define a new struct type for the closure and define a method for it.
That is, instead of capturing x::Float64 in the function y -> f(x,y), define
struct fclosure{T}
    x::T
end
(fclose::fclosure)(y) = f(fclose.x, y)
Although let blocks are normally good enough.
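A short sketch of both workarounds side by side, assuming a stand-in f(x, y) = x * y (the post does not say what f is) and two hypothetical helper functions, with_let and with_struct. The let block introduces a fresh binding that is assigned exactly once, so the closure can store it with a concrete type; the callable struct stores the value in a typed field.

```julia
f(x, y) = x * y     # stand-in for the f(x, y) in the post

struct fclosure{T}
    x::T
end
(fclose::fclosure)(y) = f(fclose.x, y)

function with_let(r)
    r = abs(r)              # reassignment would normally force a Box
    g = let r = r           # fresh binding, assigned exactly once
        y -> f(r, y)
    end
    return g
end

function with_struct(r)
    r = abs(r)
    return fclosure(r)      # stores the current value in a typed field
end

println(with_let(-3)(2))
println(with_struct(-3)(2))
println(fieldtypes(typeof(with_let(-3))))   # field types of the closure object
```

Note that both versions snapshot the value at the point where the closure is created; later changes to the outer r are not visible through them.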
Similarly, if you need to change concrete types for a variable caught in a closure, you could work around a language restriction by wrapping it:
mutable struct FloatOrInt
    x::Union{Float64,Int}
end
and then capture an instance of FloatOrInt, whose own type is concrete, if you need to switch between the two types for some reason.
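A minimal sketch of how that wrapper might be used, with a hypothetical running_total function. The captured binding acc is never reassigned; only its field is mutated, and the field is allowed to hold either type.

```julia
mutable struct FloatOrInt
    x::Union{Float64,Int}
end

function running_total(values)
    acc = FloatOrInt(0)                  # bound once, never reassigned
    foreach(v -> (acc.x += v), values)   # mutate the field, not the binding
    return acc.x
end

println(running_total((1, 2)))
println(running_total((1, 2.5)))
```

The stored value starts as an Int and becomes a Float64 only once a Float64 is added, while the type of acc itself stays FloatOrInt throughout.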
I think the main limitations are that people can be caught by surprise, and having to work around things can be annoying.
I have no idea how hard or how much time it takes to implement different solutions (ie, working on the actual type inference). Julia developer time and effort is a precious resource. I’m amazed by what they’ve done so far, so I trust their judgment on where they spend it!
I think this is the big thing. It is a completely silent performance regression that can occur basically anytime you use a comprehension or generator. The only way you’ll notice is by realizing that performance could be better and running @code_warntype on your code.
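Besides reading @code_warntype output, one rough way to check a particular closure is to look at the field types of the closure object. This is only a sketch: has_boxed_capture is a hypothetical helper, and it relies on the implementation detail that boxed captures are stored in fields of type Core.Box.

```julia
function make_boxed()
    x = 1
    x += 1               # second assignment, so x is boxed when captured
    return () -> x
end

function make_plain()
    x = 2                # single assignment, captured by value
    return () -> x
end

# Hypothetical helper: true if any captured variable is stored in a Core.Box.
has_boxed_capture(f) = any(T -> T === Core.Box, fieldtypes(typeof(f)))

println(has_boxed_capture(make_boxed()))
println(has_boxed_capture(make_plain()))
```

This only inspects a closure you already hold; it does not find closures created deep inside other code, which is where @code_warntype is still needed.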
I’m wondering, would it be possible to emit a compile-time warning in such cases?
One would need some explicit syntax for “make it work” that suppresses the warning and the overhead for it, but this would make most of the pain go away.
An interesting option would be to require that captured locals without explicit types are type-constant. That would allow intersecting all the bits of inferred info about their type instead of unioning, which is a pretty massive improvement in inferability.
For explicitly typed locals, one would of course respect the type annotation. So instead of jumping through Ref hoops to allow loosely typed captured locals, one would simply annotate them with the loose type one wants. This would, of course, just become the type annotation of the corresponding field on the closure struct.
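For comparison, current Julia already takes annotations on captured locals into account. The sketch below follows the pattern from the performance tips in the manual (function names are mine): the annotated variable is still boxed, but reads inside the closure are asserted to have the declared type, so inference has something to work with.

```julia
function abmult_untyped(r0::Int)
    r = r0
    if r < 0
        r = -r
    end
    return x -> x * r        # r is boxed and read as Any
end

function abmult_typed(r0::Int)
    r::Int = r0              # annotation on the variable itself
    if r < 0
        r = -r
    end
    return x -> x * r        # still boxed, but reads are asserted to be Int
end

println(abmult_untyped(-3)(2) == abmult_typed(-3)(2))
println(Base.return_types(abmult_untyped(-3), (Int,)))
println(Base.return_types(abmult_typed(-3), (Int,)))
```

The annotation recovers inferability but not the box itself, so this is a partial fix compared with the let-block approach.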
Regarding Elrod’s remark to trust the judgment of the core developers: I agree. But I have noticed that the core developers are (understandably) averse to breaking changes and to restricting functionality. Therefore, I am expressing my point of view to the core developers, speaking as a random user, that I would prefer to see 15276 fixed now rather than later even if the fix involves breaking changes and restricted functionality. But other users: feel free to disagree!
With regard to foobar_lv2’s suggestion about compile-time warnings, in a conversation last year on this discourse site, Yu Yichao said that the core developers will never agree to compiler warnings for legal Julia code, even if the code is most likely non-performant or even most likely erroneous. So I guess that ship has already sailed.
Finally with regard to Stefan Karpinski’s remarks about captured variables being “type-constant”, is that the same as saying that their concrete type is not allowed to vary during their lifetime?
What are the reasonable usage scenarios where a captured variable would not be type constant? If it is rare, it seems that in those cases the user could always just capture a mutable struct holding an Any. I would rather have Julia v1.0 be more strict, because allowing type instability could always be added in minor releases, but can’t be removed outside of major releases.
Yes. As usual, it’s a bit subtler than one might think. It’s not quite the same as just figuring out the type and pretending it is a local annotated with that type. It’s also not quite the same as making it a local annotated with the concrete type of the first value assigned to it. The former would be too lax and the latter would try to do auto-conversion, which is not desirable.
What you’d want is a local variable that will throw an error if you try to assign a value of a different type than the type of the first value assigned to it. We don’t have anything like that currently.
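Such a variable does not exist in the language, but the intended behavior can be approximated with a wrapper. This is only a sketch under my own assumptions: TypeLocked and set! are hypothetical names, and the choice of ArgumentError is arbitrary. The point is that a value of a different type is rejected rather than converted.

```julia
# Hypothetical container that remembers the type of its first value.
mutable struct TypeLocked{T}
    value::T
end

# Same type as the first value: the assignment goes through.
set!(v::TypeLocked{T}, x::T) where {T} = (v.value = x; v)

# Any other type: throw instead of attempting a conversion.
set!(v::TypeLocked{T}, x) where {T} =
    throw(ArgumentError("expected a value of type $T, got $(typeof(x))"))

v = TypeLocked(1)     # locked to Int
set!(v, 5)            # fine, still an Int
println(v.value)
```

Calling set!(v, 2.0) here throws, even though 2.0 could be converted to an Int without loss, which is the difference from an annotated local.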
Would this type of local variable be useful in general (not just for issue 15276)?
Naively, I imagine these could catch errors where the programmer’s intention was to have type-stable variables,
but still allow “clean” code with inferred typing without requiring explicit type annotations.
With explicitly declared types, Julia generally does auto-conversion, which has been the most friendly and convenient behavior in general. People like being able to use an integer value to provide a float field, for example, or write a[i] = 0 to zero an element of an array regardless of its element type. Since convert throws an error for values that cannot be accurately converted, this is safe as well.
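A small illustration of the three behaviors mentioned above. Celsius is a hypothetical type used only for this example.

```julia
struct Celsius          # hypothetical type, only for illustration
    degrees::Float64
end

c = Celsius(20)         # the Int 20 is converted to 20.0
a = ones(Float32, 3)
a[2] = 0                # the Int 0 is converted to 0.0f0

# A value that cannot be represented exactly is rejected, not rounded.
inexact = try
    convert(Int, 1.5)
catch err
    err
end

println(c.degrees)
println(a)
println(typeof(inexact))
```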
What are the reasonable usage scenarios where a captured variable would not be type constant?
Writing a quick and dirty script that should “just work”. The kind of thing where Python or bash performance is totally acceptable, where everything may end up as Any. It is imho pretty cool that Julia also fits that niche (but I came for the “more productive C”, as ChrisRackauckas put it). Think of a big, dirty, inhomogeneous structure of arrays where mutation of outer state is desired; due to circumstances, the capture can oscillate between Vector{Any} and Vector{something_concrete} (even though the elements might all be something_concrete).
Yu Yichao said that the core developers will never agree to compiler warnings for legal Julia code
Command-line switch or environment variable? Then I can run julia -Wall, and let the compiler help me figure out my problems. Bonus points if we get a “suppression” syntax (e.g. @suppress_warn) for cases that are “known good” despite warnings (useful for CI, both in julialang and in packages; -Wall tends to become too chatty, and if it begins spamming me with warnings about base/stdlib/bootstrap-code, then it becomes annoying). Obviously suppression would be overridden by command-line switch.
The simple answer is no, making captured variables type-constant does not fully solve the problem. Variables can be captured before they’re assigned, so we need to make a closure of the right type before we know what value (if any) will be assigned. But there do exist cases where that rule would make it easier for the compiler to optimize things.
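A minimal sketch of the capture-before-assignment case, with a hypothetical function name. The closure object has to be constructed on the first line of the body, at which point x has no value, so its storage cannot be specialized on a value that does not exist yet.

```julia
function capture_first()
    getx = () -> x    # the closure is built here; x has no value yet
    x = 1.5           # assigned only after the capture
    return getx
end

g = capture_first()
println(g())
println(fieldtypes(typeof(g)))   # how the closure stores x
```

Even though x is assigned exactly once and never changes type, the closure still stores it in a box, which is why a type-constancy rule alone would not cover this pattern.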
I think adding a rule like this only for captured variables is too ad-hoc. The idea of adding a type-const declaration is intriguing. But I’m not sure how many cases there are where that kind of declaration would work, but x::T or x::typeof(y) etc. would not. Just brainstorming here: maybe we could use the syntax x ::= 0 for both assigning x and declaring it type-const — it’s a bit more useful if you have to initialize the variable as well.
If you’re going to abandon the principle that “only values, not variables, have types”, then why not go all the way and allow C-style type declarations? Let the declaration Int x be equivalent to replacing every occurrence of x by x::Int in the current scope. Simultaneous assignment could be done with Int x = 0, or more generally with typeof(a) x = a.
The last statement would capture the type of a as the declaration statement is executed. I.e. the statement would be equivalent to something like const x_type = Core.typeof(a); x = a and then replacing all occurrences of x by x::x_type.
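For reference, the closest existing spelling is a typed local declaration, sketched below with a hypothetical function name. One difference from the proposal above: as far as I understand, a declared local converts on assignment rather than merely asserting the type, so a mismatched value is converted when possible and only throws when the conversion fails.

```julia
function declare_like(a, b)
    x::typeof(a) = a     # existing syntax, close to the proposed `typeof(a) x = a`
    x = b                # converted to typeof(a), or throws if that fails
    return x
end

println(declare_like(1.0, 2))    # the Int 2 is stored as a Float64

failed = try
    declare_like(1, 2.5)         # 2.5 cannot be converted to Int exactly
catch err
    err
end
println(typeof(failed))
```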
While most performance problems can only be detected after type inference, this one seems like it might be detectable by static linting. Is it sufficient to notice a generator/closure constructor being applied to a variable without type annotation? Maybe tricky to track the origin through all the SSAValues…
With regard to Jeff Bezanson’s explanation of why my proposed language restriction won’t solve the problem, how about the following. Introduce a new operator ::: which means that a variable occurring in an expression (presumably inside a closure) has a promised type, something like this:
function t(s)
    a = s / 9
    x = Set((a:::typeof(s))^2 + a * i for i = 1 : 10)
    return x
end
Would introduction of this operator make the problem go away?