Idiomatic way to avoid the closure capture bug?

I’ve been looking for ways to generate tuples without falling prey to the infamous closure capture bug. I’ve found essentially four different ways, and wanted to know if it makes a difference which one I use, or whether any is the idiomatic way to do it.

Suppose you want to compute ntuple(i -> i * par, Val(N)). Then you could use any of the following:

1

_par = par
return ntuple(i -> i * _par, Val(N))

2

f = Base.Fix2((i, _par) -> i * _par, par)
return ntuple(f, Val(N))

3

f = let par = par
    i -> i * par
end
return ntuple(f, Val(N))

4

let par = par
    return ntuple(i -> i * par, Val(N))
end

I think 1 and 4 are the most canonical.
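For context, here is a minimal sketch of the capture problem that all four options work around. The function names `make_boxed` and `make_localized` are hypothetical, and the sketch assumes `par` is reassigned before the closure is created, which is the usual trigger. On current Julia releases the captured variable is then stored in a `Core.Box`; rebinding it with `let` gives the closure a concretely typed field instead.

```julia
# Hypothetical helper: `par` is assigned twice (argument + reassignment),
# so lowering wraps the captured variable in a Core.Box.
function make_boxed(par)
    if par < 0
        par = -par
    end
    return i -> i * par
end

# Same logic, but the closure captures a fresh binding created by `let`,
# which is assigned exactly once and therefore is not boxed.
function make_localized(par)
    if par < 0
        par = -par
    end
    return let par = par
        i -> i * par
    end
end

# Inspect what each closure actually stores.
boxed_fields = fieldtypes(typeof(make_boxed(-2)))
localized_fields = fieldtypes(typeof(make_localized(-2)))
println("boxed closure fields:     ", boxed_fields)
println("localized closure fields: ", localized_fields)
```

Both closures compute the same values; the difference is only in what the compiler can infer about `par` inside the closure body.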


Option 4 would be my preferred way to do it, but all the options you show here are going to behave the same, other than how much or how little they clutter your namespace. Option 4 clutters the namespace the least, which is why I prefer it.

Sometimes I use this macro to make it a little cleaner:

"""
    @localize args... expr

Writing
```
@localize x y z expr
```
is equivalent to writing
```
let x=x, y=y, z=z
    expr
end
```
This is useful for avoiding the boxing of captured variables when working with closures.

See https://juliafolds2.github.io/OhMyThreads.jl/stable/literate/boxing/boxing/ for more information about boxed variables.
"""
macro localize(args...)
    syms = args[1:end-1]
    ex = args[end]
    letargs = map(syms) do sym
        if !(sym isa Symbol)
            throw(ArgumentError("All but the final argument to `@localize` must be symbols! Got $sym"))
        end
        :($sym = $sym)
    end
    esc(:(let $(letargs...)
              $ex
          end))
end

With that, you could write

@localize par ntuple(i -> i * par, Val(N))

Well, actually, the statement that all the options are going to be the same is technically not quite true. Option 2 does carry the potential for a performance difference. The Fix2 object only specializes on the type of par, not the value of par, so using it can block constant propagation of par’s value in some circumstances.


Here’s an example where an anonymous function results in more efficient code than Fix2:

foo(f,a,b) = f(1) ? a / b : b / a;
greater_than_1(x) = x > 1;

This is efficient:

julia> @code_llvm debuginfo=:none foo(greater_than_1, 1, 2)
; Function Signature: foo(typeof(Main.greater_than_1), Int64, Int64)
define double @julia_foo_4589(i64 signext %"a::Int64", i64 signext %"b::Int64") #0 {
top:
  %0 = sitofp i64 %"b::Int64" to double
  %1 = sitofp i64 %"a::Int64" to double
  %2 = fdiv double %0, %1
  ret double %2
}

This is less efficient:

julia> @code_llvm debuginfo=:none foo(Base.Fix2(>, 1), 1, 2)
; Function Signature: foo(Base.Fix{2, typeof(Base.:(>)), Int64}, Int64, Int64)
define double @julia_foo_4592(ptr nocapture noundef nonnull readonly align 8 dereferenceable(8) %"f::Fix", i64 signext %"a::Int64", i64 signext %"b::Int64") #0 {
top:
  %"f::Fix.unbox" = load i64, ptr %"f::Fix", align 8
  %0 = icmp sgt i64 %"f::Fix.unbox", 0
  br i1 %0, label %L8, label %L4

common.ret:                                       ; preds = %L8, %L4
  %common.ret.op = phi double [ %3, %L4 ], [ %6, %L8 ]
  ret double %common.ret.op

L4:                                               ; preds = %top
  %1 = sitofp i64 %"a::Int64" to double
  %2 = sitofp i64 %"b::Int64" to double
  %3 = fdiv double %1, %2
  br label %common.ret

L8:                                               ; preds = %top
  %4 = sitofp i64 %"b::Int64" to double
  %5 = sitofp i64 %"a::Int64" to double
  %6 = fdiv double %4, %5
  br label %common.ret
}

Instead of (i, _par) -> i * _par, just write *. Then the entire example 2 reduces to just:

return ntuple(Base.Fix2(*, par), Val(N))
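As a quick check of that simplification, the following sketch compares the `Base.Fix2(*, par)` form against the plain closure. The wrapper names `scale_fix` and `scale_closure` are hypothetical and exist only for this illustration. `Base.Fix2(f, x)` produces a callable that evaluates `f(y, x)`, so `Base.Fix2(*, par)(i)` is `i * par`, with the same argument order as the closure.

```julia
# Hypothetical wrappers around the two spellings of option 2 and the plain closure.
scale_fix(par, ::Val{N}) where {N} = ntuple(Base.Fix2(*, par), Val(N))
scale_closure(par, ::Val{N}) where {N} = ntuple(i -> i * par, Val(N))

# Fix2 fixes the *second* argument, so the call below computes 2 * 3.
f = Base.Fix2(*, 3)
println(f(2))

# Both wrappers produce the same tuple.
println(scale_fix(3, Val(4)))
println(scale_closure(3, Val(4)))
```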

You’ll almost always avoid the closure capture bug if you keep to simpler variable lifetimes and assignments. I’ve found it makes my code more readable for humans, too!

E.g., instead of assigning the same name many times inside lots of different if branches, refactor to functions or expressions that return or evaluate to the value(s) you want and assign the name once.
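A minimal sketch of that refactoring, assuming the reassignment in question is a sign flip (the function name `scaled_once` and the variable `p` are hypothetical). Because `p` is bound exactly once, from an expression that evaluates to the final value, the closure can capture it without any extra `let`.

```julia
# Hypothetical example: compute the final value in one expression and bind it once,
# rather than reassigning `par` inside an `if` branch.
function scaled_once(par, ::Val{N}) where {N}
    p = par < 0 ? -par : par   # single assignment
    return ntuple(i -> i * p, Val(N))
end

println(scaled_once(-2, Val(3)))
println(scaled_once(2.5, Val(2)))
```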


While this is something to be aware of, I will clarify that this does not mean that Base.Fix is less efficient than closures. It is just a gotcha to keep in mind, both while programming and while checking performance. And the gotcha is not specific to Base.Fix; it is a more general property of Julia. Programmers who care about performance in Julia simply must be aware of the types in their code.

To be clear, I am not saying your post is an argument against Base.Fix. But if it were taken as one, that argument would be a straw man.

A fair comparison leads to identical code_llvm:

julia> @code_llvm debuginfo=:none ((x, y) -> foo(greater_than_1, x, y))(1, 2)
; Function Signature: var"#2"(Int64, Int64)
define double @"julia_#2_0"(i64 signext %"x::Int64", i64 signext %"y::Int64") local_unnamed_addr #0 {
top:
  %0 = sitofp i64 %"y::Int64" to double
  %1 = sitofp i64 %"x::Int64" to double
  %2 = fdiv double %0, %1
  ret double %2
}

julia> @code_llvm debuginfo=:none ((x, y) -> foo(Base.Fix2(>, 1), x, y))(1, 2)
; Function Signature: var"#5"(Int64, Int64)
define double @"julia_#5_0"(i64 signext %"x::Int64", i64 signext %"y::Int64") local_unnamed_addr #0 {
top:
  %0 = sitofp i64 %"y::Int64" to double
  %1 = sitofp i64 %"x::Int64" to double
  %2 = fdiv double %0, %1
  ret double %2
}

I now tend to believe that the let block and the local keyword are redundant constructs. You can always write a separate function (a so-called function barrier) to introduce a local variable.

So I think the closure capture bug becomes less important.
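A sketch of what that could look like for the running example, with hypothetical names `make_scaler` and `scaled_via_barrier`. The closure is created inside a small helper whose argument is never reassigned, so the capture is clean even though the caller reassigns its own `par`.

```julia
# Hypothetical helper acting as the function barrier: `par` here is a fresh
# argument binding that is never reassigned, so the closure captures it unboxed.
make_scaler(par) = i -> i * par

function scaled_via_barrier(par, ::Val{N}) where {N}
    if par < 0
        par = -par   # reassignment happens outside the closure-creating function
    end
    return ntuple(make_scaler(par), Val(N))
end

println(scaled_via_barrier(-2, Val(3)))
println(fieldtypes(typeof(make_scaler(2))))
```

This plays the same role as `let par = par`, at the cost of one extra named function.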

Agreed. I was just clarifying because @araujoms asked if there was a difference between the approaches, and from that PoV, option 2 is the only one that has any potential differences in codegen, whereas the others are just stylistic changes.

I agree, though, that the differences with Fix1/Fix2/Returns/… are pretty minor, and in the overwhelming majority of cases they are not going to cause any appreciable differences in codegen.

But it’s something to be aware of whenever you pass around structs that wrap data, rather than embedding that data in a function body directly: if the struct ever gets passed through a function barrier where it fails to inline, there are more cases where you can end up with under-specialized code.
