RFC: Some Ideas to Tackle #15276 - performance of captured variables in closures

uniment · March 3, 2023, 10:19pm

~~Thank you, this is the type of rebuttal I’m in need of.~~ Edit: Sorry, appears to be the same error.

~~Maybe I’m lacking imagination, or maybe it’s because I lack the skills to reason about this formally, or maybe I’m still suffering COVID brain-fog—but with my informal reasoning~~ I can’t imagine any cases (excepting where @goto is used) where it cannot be reasonably easily determined that {there exists no syntactical assignment of a captured variable after the closure is syntactically declared, and therefore the variable doesn’t need to be boxed since it’s impossible for it to change value during the closure’s lifetime}. Using structured code, the sequence of expressions in the function body maps fairly linearly to its execution (except of course, lifting closure definitions to global scope). Can you offer any examples (without @goto) where my intuition is failing me, or is it the accommodation of @goto which makes Idea 1 impossible?

Admittedly, I hadn’t even considered @goto until reading your description.

P.S. I decided to explore how closures function in the presence of @goto. While exploring that, I contrived some degenerate code which seems to expose an obscure bug in the boxing logic.

Click to expand for details

Code:

julia> function baz()
           j=0
           @label start
           i = j + 1
           println("i is now ", i)
           if j > 0; @goto stop end
           f = ()->i
           @label stop
           j += 1.0
           if j < 3; @goto start end
           return f, nothing
       end
baz (generic function with 1 method)

julia> f, _ = baz()
i is now 1
i is now 2.0
i is now 3.0
(var"#25#26"{Int64}(1), nothing)

julia> f()
1

The capture f.i is a constant 1, not reflecting how the identifier i has changed in the parent scope of baz. One might argue that this isn’t actually a bug, but such an argument becomes strained when you try changing baz’s last line to return f, i, in which case f.i is a Core.Box(3.0).

~~nbd, but interesting.~~

Mason · March 3, 2023, 10:31pm

It’s definitely a bug, and it doesn’t require @goto.

julia> function baz2()
           j = 0
           local f, i
           while j < 3
               i = j + 1
               println("i is now ", i)
               if !(j > 0)
                   f = () -> i
               end
               j += 1.0
           end
           f, nothing
       end;

julia> baz2()[1]()
i is now 1
i is now 2.0
i is now 3.0
1

julia> function baz3()
           j = 0
           local f, i
           while j < 3
               i = j + 1
               println("i is now ", i)
               if !(j > 0)
                   f = () -> i
               end
               j += 1.0
           end
           f, i
       end;

julia> baz3()[1]()
i is now 1
i is now 2.0
i is now 3.0
3.0

Would you mind filing a bug report?

Sukera · March 3, 2023, 11:04pm

I remember this from some issue, but I can’t find it right now. The underlying reason for the behavior is that conditional definition of the closure itself is not really possible, especially not in a loop, because there’s no way to tell in lowering which version of i needs to be captured. This then results in the first value being captured, instead of a continually updating one. You can get a similar behavior with just an if/else surrounding the closure, whose value would depend on a passed-in variable, I believe.

Sukera · March 3, 2023, 11:17pm

I mean, this is really just the same that I already stated earlier, but seeing as you don’t seem to read my posts anyway because I’m an undergrad…

Maybe I should have been clearer about the complexity class involved here, but for posterity sake, as far as we know today, SAT is an NP-hard problem in the general case.

I do hope though you are aware that Discourse saves the editing history of a post, so putting [censored] over what you wrote in the past does not make the words go away (nor is it accurate - we’re not censoring you). What you wrote originally is still accessible, by clicking on the small orange pencil icon in the top right hand corner of your post.

Sukera · March 3, 2023, 11:23pm

No worries! The basic difference between your initial view and mine was which part of this is the Int64:

<-- abstract object ->
<--- yours ---------->
(Type-Tag | Data Bits)
            <- mine ->

That is, I see the abstract object to be distinct from the concrete thing in the language. Kind of like a variable name and the value its bound to are distinct things. Since only the latter is truly accessible inside of the language, thinking about type tags that only exist in an implementation of the language is a dangerous game

This is now veering dangerously close to being off topic, so further discussion should probably be moved to another thread - sorry for derailing, everyone!

Elrod · March 4, 2023, 4:46am

julia> macro checkconst(expr)
           map(expr.args) do ex
               ex isa Symbol || return ex
               (
                   Symbol = ex,
                   IsConst = isconst(__module__, ex),
                   Value = isdefined(__module__, ex) ?
                       getproperty(__module__, ex) :
                       Symbol("#undef#")
               )
           end
       end
@checkconst (macro with 1 method)

julia> const A = 4
4

julia> B = 7
7

julia> @checkconst A + B + C
4-element Vector{NamedTuple{(:Symbol, :IsConst, :Value)}}:
 (Symbol = :+, IsConst = true, Value = +)
 (Symbol = :A, IsConst = true, Value = 4)
 (Symbol = :B, IsConst = false, Value = 7)
 (Symbol = :C, IsConst = false, Value = Symbol("#undef#"))

This obviously isn’t what an actual checkconst macro would look like.
But I don’t think it is unreasonable to use isconst in this way.

Mason · March 4, 2023, 4:56am

That would produce spectacularly incorrect answers in the presence of local scoping.

julia> const A = 1;

julia> let A = 200000000
           @checkconst (A, )
       end
1-element Vector{NamedTuple{(:Symbol, :IsConst, :Value), Tuple{Symbol, Bool, Int64}}}:
 (Symbol = :A, IsConst = 1, Value = 1)

Elrod · March 4, 2023, 4:56am

You’re correct.

uniment · March 4, 2023, 5:18am

Done.

uniment · March 4, 2023, 5:34am

Can you explain what is meant by this?

c42f · March 6, 2023, 5:06am

This was/is something I want to do. But right now there’s no funding to continue working on JuliaSyntax as a job so… yeah it’s unlikely to happen any time soon given how large a project it is to rewrite all of lowering. For now I hope to just land the parser in Base.

uniment · March 6, 2023, 10:33am

[redacted]

That’s too bad. I’m sure you’ve looked into it and have some idea how much effort it would take. How many engineer-months would you estimate (ideally vs. conservatively) it might require? Is it the sort of thing that the Julia community could realistically support, or would it require outside financial support?

uniment · March 6, 2023, 10:34am

Edit: See comment 55 for better draft pseudo-code.

Drafting some rough psuedo-code of the proposal I’ve been imagining for Idea 1, barring any egregious oversight, it appears to be O(n) in the number of expressions in the function body. I’d appreciate a sanity check.

Draft Pseudo-Code of Idea 1:

Define assigns to take an expression and a symbol x. Return true if it or any of its subexpressions (recursive) makes any assignment to x [excluding nested scopes with their own local x]—i.e., check for Expr(:(=), x, ...) and similar assignments, as well special-cased handling of :tuple assignments. Otherwise, return false.

With prior knowledge that a local variable x is captured by an anonymous closure, decide whether to box x by the following procedure (true for box, false for no-box):

Check inside body of closure. If it assigns to x, return true.
Set parent as closure’s parent expression, child as closure expression.
While true: # climb tree, search neighbors for assignments
a. If parent is the body of a for or while loop, and if x is bound to an outer scope, check subexpressions of parent that preceed child. If any assigns to x, return true.
b. If parent is a for or while loop, and if x is bound to an outer scope, check if parent’s condition assigns to x. If so, return true.
c. If parent is the .args[2] of a if or elseif conditional, continue. # next branch is syntactically mutually exclusive.
d. If parent is a tuple expression, check if it assigns to x. If so, return true.
e. Check subexpressions of parent that succeed child. If any assigns to x, return true.
f. If parent is the scope x is bound to, break.
g. Set child=parent and parent to its own parent and continue loop.
Return false.

The basic gist is: if the captured variable has any assignment expression in or after the closure’s declaration, or before it if in a loop, and discounting branches that are syntactically unreachable from the closure’s declaration, then box it—but otherwise don’t. Allowing for identifiers to be rebound unboxed before the closure is declared (and only boxing if identifiers can be rebound after closure declaration) should cause a good number of boxes to go away (such as this, this, this, this, this, this, this, this, this, this, this, this, and these, not to mention the one from Performance Tips as well as most others that I’ve encountered), and barring any mistakes [and excluding @goto], afaict it should be a strict improvement over current behavior.

Notes:

There may be a way to accommodate @goto—initial thoughts suggest it’s still O(n), though it seems likely to require more computation—but I’ll defer those thoughts for later until I gain confidence in this one.
Local named functions also present a unique challenge, but on initial thought they too seem doable. I can think of some degenerate behavior, but it coincides with existing degenerate behavior so introduces no new concern.
Local recursive named functions deserve consideration.

Zach_Christensen · March 6, 2023, 4:37pm

This seems like a natural next step once incorporating compiler passes is fully designed and implemented (which it seems is already pretty far along from my understanding).

Once that’s in place we could play around with alternative optimizations for things like closures more easily.

Rust seems to have done a really good job of providing its own frontend to LLVM so it’s easy to dig in and improve Rust once you know the language. But they don’t concern themselves with compilation time nearly as much as Julia does, so it’s not clear how hard that would be to implement in a reasonable amount of time for Julia.

uniment · March 8, 2023, 3:48am

To make Idea 1 of the OP work with @goto, scrap the above idea of working with the AST and instead work closer to the IR. Seems easier overall (less special cases).

Draft Pseudo-Code of Idea 1 (with goto):

(assumption: all AST has been lowered to flattened IR, but no Core.Boxes have been inserted yet. There might currently be no point during lowering where this is true, but the concept could be adapted.)

With prior knowledge that a local variable x is captured by a closure, decide whether to box x by the following procedure (true for box, false for no-box):

Define assigns(n,x,checked) to take a line number n, a symbol x, and a set of previously explored line numbers checked. assigns operates as follows:

If n ∈ checked, return false.
Push n into checked.
While true:
a. If expression at line n is a Expr(:(=), x, ...), return true.
b. If expression at line n is a :method of a closure known to capture x, push n into checked. Loop over every sub-expression in its CodeInfo’s .code. If any is a Expr(:(=), x, ...), return true.
c. If expression at line n is a Core.GotoNode, return assigns(ex.label,x,checked).
d. If expression at line n is a Core.GotoIfNot, return assigns(n+1,x,checked)||assigns(ex.label,x,checked).
e. If expression at line n is a Core.ReturnNode, or if there are no more expressions, return false.
f. Increment n.

Then, initialize checked to an empty set; let n be the line at which the closure :method is defined; and if assigns(n,x,checked), then box x. If multiple closures are known to capture x, simply call assigns on them without emptying checked to save some computations.

The basic gist is: Explore all expressions that are syntactically reachable in and after the closure’s declaration. If any is an assignment to the captured variable, then box it—but otherwise don’t. Keep a record checked of labels and methods that have been visited, to avoid repeated work and getting stuck in loops. As before:

cc @simeonschaub you seem to have a pretty good grasp of closure capture boxing; do you mind if I entice you to opine?

c42f · March 9, 2023, 4:42am

Hard to estimate because the scope is less clear than you might think: The point of doing this wouldn’t be to just port the existing flisp code to Julia. Because we wouldn’t get much out of that, other than it being a little easier to modify lowering. To really make a difference there’s some rethinking to be done.

Primarily, how do we preserve the detailed source location information provided by JuliaSyntax through macro expansion? This is hard because Expr is the public API for source code in macros (and also the data structure for quoted code). But Expr has no fields for this information. Perhaps there’s a way to add such fields without breaking everything or causing a lot of bloat. But it’s not really clear. Second/related how do we preserve source location through the various lowering passes in a sane way?

It’s at least months several months of effort by someone who’s already a bit familiar with lowering. (shorter for a demo / PoC… but from trying to replace the parser I know there will be a long, long tail of obscure bugs and compatibility things to work out) Also I should point out that I know I’m eternally optimistic and bad at estimating. So to be more realistic I’m just going to say a year’s work :-/

uniment · March 9, 2023, 7:28pm

Once lowering is in Julia, it should be easier to attract labor to help work out larger changes; currently there’s a two-language barrier. Also, is enabling the final solution for #15276 not a worthy goal in its own right?

If we restricted the scope of work to just porting julia-syntax.scm to Julia, how much do you figure that would shorten the schedule? I know, porting/copying is far less gratifying than new creation, but let’s hypothesize we’re willing to swallow that bitter pill for the moment. (I suspect JuliaSyntax’s loftier goals—propagating source trivia past macro expansion—might require waiting for a breaking 2.0 release anyway, but in the meantime having better syntax errors and improving type stability would defo be appreciated.)

c42f · March 10, 2023, 2:33am

I’ve seen it stated somewhere, fairly authoritatively, that this can’t really be solved in lowering (though some aspects could still maybe be improved). I don’t pretend to have thought about closure lowering in depth, though, so I don’t have any strong opinions about this myself.

Also for the record, I was only responding to particular queries where I was mentioned in this thread, I haven’t had the time to read the whole thing.

uniment · March 10, 2023, 6:42am

Hmm, that’s interesting because it seems to stand at odds with this fairly authoritative comment from @tim.holy:

Because a full solution will require some redesign of the lowering stage, I’m guessing it won’t be tackled until the lowering stage is rewritten in Julia . See the *.scm files in src/ .

so, I’m curious to know where’s the disconnect. Tim’s comment matches my intuition, so my instinct is to believe it, despite Idea 3 of my OP being to work around it (supposing that lowering wouldn’t be ported to Julia). Can you offer a source or reference?

GunnarFarneback · March 10, 2023, 6:59am

I have no idea about the technicalities of this problem but I see no disconnect between changes in lowering being required while not being sufficient.

Topic		Replies	Views
Can someone explain closures to me Internals & Design performance , type-stability , closure	13	2986	November 2, 2023
Performance closures with changing value of closed over variables Performance closure	1	383	April 20, 2021
Language restriction to solve 15276? Internals & Design	48	3828	May 11, 2018
`Core.Box` around captured variable General Usage closure , corebox	2	354	December 6, 2022
Help understand boxing Performance	18	891	March 22, 2024

RFC: Some Ideas to Tackle #15276 - performance of captured variables in closures

Related topics