Idiomatic way to avoid the closure capture bug?

I’ve been looking for ways to generate tuples without falling prey to the infamous closure capture bug. I’ve found essentially four different ways, and wanted to know if it makes a difference which one I use, or whether any is the idiomatic way to do it.

Suppose you want to do ntuple(i -> i * par, Val(N)). Then you could do any of the following:

1

_par = par
return ntuple(i -> i * _par, Val(N))

2

f = Base.Fix2((i, _par) -> i * _par, par)
return ntuple(f, Val(N))

3

f = let par = par
    i -> i * par
end
return ntuple(f, Val(N))

4

let par = par
    return ntuple(i -> i * par, Val(N))
end

I think 1 and 4 are most canonical


Option 4 would be my preferred way to do it, but all the options you show here will behave the same; they differ only in how much they clutter your namespace. Option 4 clutters it the least, which is why I prefer it.

Sometimes I use this macro to make it a little cleaner:

"""
    @localize args... expr

Writing
```
@localize x y z expr
```
is equivalent to writing
```
let x=x, y=y, z=z
    expr
end
```
This is useful for avoiding the boxing of captured variables when working with closures.

See https://juliafolds2.github.io/OhMyThreads.jl/stable/literate/boxing/boxing/ for more information about boxed variables.
"""
macro localize(args...)
    syms = args[1:end-1]
    ex = args[end]
    letargs = map(syms) do sym
        if !(sym isa Symbol)
            throw(ArgumentError("All but the final argument to `@localize` must be symbols! Got $sym"))
        end
        :($sym = $sym)
    end
    esc(:(let $(letargs...)
              $ex
          end))
end

With that, you could write

@localize par ntuple(i -> i * par, Val(N))

Well, actually, my statement that all the options are the same is technically not quite true. Option 2 does carry the potential for a performance difference: the Fix2 object only specializes on the type of par, not its value, so using it can block constant propagation of par’s value in some circumstances.


Here’s an example where an anonymous function results in more efficient code than Fix2:

foo(f,a,b) = f(1) ? a / b : b / a;
greater_than_1(x) = x > 1;

This is efficient:

julia> @code_llvm debuginfo=:none foo(greater_than_1, 1, 2)
; Function Signature: foo(typeof(Main.greater_than_1), Int64, Int64)
define double @julia_foo_4589(i64 signext %"a::Int64", i64 signext %"b::Int64") #0 {
top:
  %0 = sitofp i64 %"b::Int64" to double
  %1 = sitofp i64 %"a::Int64" to double
  %2 = fdiv double %0, %1
  ret double %2
}

This is less efficient:

julia> @code_llvm debuginfo=:none foo(Base.Fix2(>, 1), 1, 2)
; Function Signature: foo(Base.Fix{2, typeof(Base.:(>)), Int64}, Int64, Int64)
define double @julia_foo_4592(ptr nocapture noundef nonnull readonly align 8 dereferenceable(8) %"f::Fix", i64 signext %"a::Int64", i64 signext %"b::Int64") #0 {
top:
  %"f::Fix.unbox" = load i64, ptr %"f::Fix", align 8
  %0 = icmp sgt i64 %"f::Fix.unbox", 0
  br i1 %0, label %L8, label %L4

common.ret:                                       ; preds = %L8, %L4
  %common.ret.op = phi double [ %3, %L4 ], [ %6, %L8 ]
  ret double %common.ret.op

L4:                                               ; preds = %top
  %1 = sitofp i64 %"a::Int64" to double
  %2 = sitofp i64 %"b::Int64" to double
  %3 = fdiv double %1, %2
  br label %common.ret

L8:                                               ; preds = %top
  %4 = sitofp i64 %"b::Int64" to double
  %5 = sitofp i64 %"a::Int64" to double
  %6 = fdiv double %4, %5
  br label %common.ret
}


Instead of (i, _par) -> i * _par, just write *. Then the entire example 2 reduces to just:

return ntuple(Base.Fix2(*, par), Val(N))
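
A quick check of that simplification, as a minimal sketch with assumed example values (par = 3 and N = 4 are not from the thread). Base.Fix2(*, par) fixes the second argument of *, so calling it with i computes i * par:

```julia
par = 3                  # assumed example value
f = Base.Fix2(*, par)    # f(i) is equivalent to i * par

f(2)                     # computes 2 * par
ntuple(f, Val(4))        # same tuple as ntuple(i -> i * par, Val(4))
```

Since par lives in a struct field here rather than in a captured variable, there is nothing for the closure machinery to box.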

You’ll almost always avoid it if you keep to simpler variable lifetimes and assignments. I’ve found it makes my code more readable for humans, too!

E.g., instead of assigning the same name many times inside lots of different if branches, refactor to functions or expressions that return or evaluate to the value(s) you want and assign the name once.
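
A rough sketch of that refactoring, with hypothetical function names chosen for illustration. In the first version par is assigned twice before being captured, which is the pattern that typically gets it boxed; in the second it is assigned exactly once from a single expression:

```julia
# `par` is assigned in two places, then captured by the closure.
function signed_tuple_reassigned(x)
    par = 1
    if x < 0
        par = -1
    end
    return ntuple(i -> i * par, Val(3))
end

# `par` is assigned once, from an expression that evaluates to the value.
function signed_tuple_single(x)
    par = x < 0 ? -1 : 1
    return ntuple(i -> i * par, Val(3))
end

signed_tuple_reassigned(-5) == signed_tuple_single(-5)   # same result either way
```

Both versions return the same tuple; the difference should only show up in inference, for instance when comparing them with @code_warntype.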


While this is something to be aware of, I will clarify that this does not mean that Base.Fix is less efficient than closures. It is just a gotcha to keep in mind, both while programming and while checking performance. And the gotcha is not specific to Base.Fix, rather it is a more general property of Julia. Programmers who care about performance in Julia simply must be aware of types in their code.

If your post were taken as an argument against Base.Fix (which I am not saying it is), that argument would be a straw man.

A fair comparison leads to identical code_llvm:

julia> @code_llvm debuginfo=:none ((x, y) -> foo(greater_than_1, x, y))(1, 2)
; Function Signature: var"#2"(Int64, Int64)
define double @"julia_#2_0"(i64 signext %"x::Int64", i64 signext %"y::Int64") local_unnamed_addr #0 {
top:
  %0 = sitofp i64 %"y::Int64" to double
  %1 = sitofp i64 %"x::Int64" to double
  %2 = fdiv double %0, %1
  ret double %2
}

julia> @code_llvm debuginfo=:none ((x, y) -> foo(Base.Fix2(>, 1), x, y))(1, 2)
; Function Signature: var"#5"(Int64, Int64)
define double @"julia_#5_0"(i64 signext %"x::Int64", i64 signext %"y::Int64") local_unnamed_addr #0 {
top:
  %0 = sitofp i64 %"y::Int64" to double
  %1 = sitofp i64 %"x::Int64" to double
  %2 = fdiv double %0, %1
  ret double %2
}

I now tend to believe that the let block and the local keyword are redundant constructs. You can always write a separate function (a so-called function barrier) to introduce a local variable.

So I think the closure capture bug is rendered less important.
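
A minimal sketch of the function-barrier idea, using hypothetical names (scale_tuple and compute are made up for illustration). The closure is created in a helper whose par argument is never reassigned, while the caller is free to reassign its own par because it no longer captures it:

```julia
# Helper: `par` is an argument that is never reassigned here,
# so the closure captures it as a plain typed field.
scale_tuple(par, ::Val{N}) where {N} = ntuple(i -> i * par, Val(N))

function compute(x)
    par = x
    if par < 0
        par = -par   # reassignment is harmless: no closure in this scope
    end
    return scale_tuple(par, Val(3))
end

compute(-2)
```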

Agreed. I was just clarifying because @araujoms asked if there was a difference between the approaches, and from that PoV, option 2 is the only one that has any potential differences in codegen, whereas the others are just stylistic changes.

I agree, though, that the differences with Fix1/Fix2/Returns/… are pretty minor, and in the overwhelming majority of cases they are not going to cause any appreciable differences in codegen.

But it’s something to be aware of whenever passing around structs that wrap data, rather than embedding that data in a function body directly, because if the struct ever gets passed through a function barrier where it fails to inline, then there are more cases where you can get under-specialized code.


Thanks for all the answers! I’m glad it doesn’t really make a difference.

2 is the way recommended in the SciML style guide, and it’s the most inconvenient to use (especially if there are several parameters to fix).

3 is the way recommended in the Julia manual. While in this simple example it’s quite ugly, I think it makes sense when you need to build a complicated function.

Otherwise I think 4 is indeed the best.

The Julia manual only writes it like 3 because the closure needed to be returned, not immediately called by a higher-order function. Moving the higher-order function call, ntuple here, into the let block like 4 is just as typical, if not more so.

2 is preferred by some because it doesn’t involve a scope capturing variables at all, just a structure containing a value, as the writer may intend. While these small examples are all equivalent for this purpose, later adding an assignment inside the let block, or another closure that does something else, ruins the no-reassignment optimization, often by accident. I ran into this in Base before; it ended up getting fixed with 1 to avoid layering more blocks. 2 isn’t used more often because Fix{N}-ing one positional argument is often not feasible and warrants custom functors.
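
For illustration, a custom functor for the case where more than one parameter needs fixing might look like the sketch below. ScaleShift and its fields are hypothetical names, not something from Base or from the thread:

```julia
# Hypothetical functor that fixes two parameters at once.
# The type parameters keep both fields concretely typed.
struct ScaleShift{A,B}
    scale::A
    shift::B
end

# Make instances callable: f(i) computes i * scale + shift.
(f::ScaleShift)(i) = i * f.scale + f.shift

ntuple(ScaleShift(2, 1), Val(3))
```

Like Fix2, this stores values in struct fields instead of capturing variables, so later reassignments in the surrounding scope cannot affect it.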


:+1:
For this it would be nice if the const keyword were allowed in local scope to indicate that “I really don’t want to reassign this variable.” Unfortunately, after more than 10 years, there’s still no agreement about the implementation details.


What’s the no-reassignment optimization?

All of the approaches except Fix involve creating a new variable that is only assigned once during or after declaration, never reassigned. That lets the lowerer implement the closure as a struct instance with the captured variable represented as a parametric immutable field. I called it an optimization because otherwise, it becomes an untyped mutable field to store reassignments, which could occur in other closures with no explicit types. A common intuition is that inferable closures ought to help, but like any other function, different input types would result in different inferred types, and the captured variable needs to accommodate all possibilities. There are possible implementation improvements with regard to prior reassignments in the same scope, calling closures that don’t escape, or explicit types.
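
One way to see the two representations is to inspect the field types of the closure object. This is only a sketch: the function names are hypothetical, and Core.Box is an implementation detail of current Julia 1.x lowering rather than documented API, so the exact output may differ between versions:

```julia
# `r` is reassigned and then captured, so it is expected to be boxed.
function mult_reassigned(r::Int)
    if r < 0
        r = -r
    end
    return x -> x * r
end

# The `let` introduces a fresh `r` that is assigned exactly once;
# the closure captures that one instead of the reassigned outer `r`.
function mult_let(r::Int)
    if r < 0
        r = -r
    end
    return let r = r
        x -> x * r
    end
end

@show fieldtypes(typeof(mult_reassigned(-2)))  # expected to contain Core.Box
@show fieldtypes(typeof(mult_let(-2)))         # expected to contain Int
```

Both closures compute the same products; only the storage of the captured variable differs.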

Global const reassignments being possible across world ages now makes the keyword less suitable for catching unintended reassignments. I still think it would have been really nice to have a consistent and required syntax for variable declarations, maybe even definitions, that can’t occur twice in the same scope; that more traditional approach lets accumulating for-loops look the same in functions and in the global scope, for instance.
