RFC: Some Ideas to Tackle #15276 - performance of captured variables in closures

Edit: See comment 55 for better draft pseudo-code.

Drafting some rough psuedo-code of the proposal I’ve been imagining for Idea 1, barring any egregious oversight, it appears to be O(n) in the number of expressions in the function body. I’d appreciate a sanity check.

Draft Pseudo-Code of Idea 1:

Define assigns to take an expression and a symbol x. Return true if it or any of its subexpressions (recursive) makes any assignment to x [excluding nested scopes with their own local x]—i.e., check for Expr(:(=), x, ...) and similar assignments, as well special-cased handling of :tuple assignments. Otherwise, return false.

With prior knowledge that a local variable x is captured by an anonymous closure, decide whether to box x by the following procedure (true for box, false for no-box):

  1. Check inside body of closure. If it assigns to x, return true.
  2. Set parent as closure’s parent expression, child as closure expression.
  3. While true: # climb tree, search neighbors for assignments
    a. If parent is the body of a for or while loop, and if x is bound to an outer scope, check subexpressions of parent that preceed child. If any assigns to x, return true.
    b. If parent is a for or while loop, and if x is bound to an outer scope, check if parent’s condition assigns to x. If so, return true.
    c. If parent is the .args[2] of a if or elseif conditional, continue. # next branch is syntactically mutually exclusive.
    d. If parent is a tuple expression, check if it assigns to x. If so, return true.
    e. Check subexpressions of parent that succeed child. If any assigns to x, return true.
    f. If parent is the scope x is bound to, break.
    g. Set child=parent and parent to its own parent and continue loop.
  4. Return false.

The basic gist is: if the captured variable has any assignment expression in or after the closure’s declaration, or before it if in a loop, and discounting branches that are syntactically unreachable from the closure’s declaration, then box it—but otherwise don’t. Allowing for identifiers to be rebound unboxed before the closure is declared (and only boxing if identifiers can be rebound after closure declaration) should cause a good number of boxes to go away (such as this, this, this, this, this, this, this, this, this, this, this, this, and these, not to mention the one from Performance Tips as well as most others that I’ve encountered), and barring any mistakes [and excluding @goto], afaict it should be a strict improvement over current behavior.

Notes:

  • There may be a way to accommodate @goto—initial thoughts suggest it’s still O(n), though it seems likely to require more computation—but I’ll defer those thoughts for later until I gain confidence in this one.
  • Local named functions also present a unique challenge, but on initial thought they too seem doable. I can think of some degenerate behavior, but it coincides with existing degenerate behavior so introduces no new concern.
  • Local recursive named functions deserve consideration.