Type-instability because of @threads boxing variables

Yes.
You could try adding let blocks.
Note that Threads.@spawn has sugar where using $ on variables automatically creates let blocks for you:

julia> @macroexpand Threads.@spawn foo($x)
quote
    #= threadingconstructs.jl:229 =#
    let var"##394" = x
        #= threadingconstructs.jl:230 =#
        local var"#14#task" = Base.Threads.Task((()->begin
                            #= threadingconstructs.jl:226 =#
                            foo(var"##394")
                        end))
        #= threadingconstructs.jl:231 =#
        (var"#14#task").sticky = false
        #= threadingconstructs.jl:232 =#
        if $(Expr(:islocal, Symbol("##sync#48")))
            #= threadingconstructs.jl:233 =#
            Base.Threads.put!(var"##sync#48", var"#14#task")
        end
        #= threadingconstructs.jl:235 =#
        Base.Threads.schedule(var"#14#task")
        #= threadingconstructs.jl:236 =#
        var"#14#task"
    end
end

But this isn’t supported for Threads.@threads, although it’s probably easy enough to manually create let blocks, unless you have an absurd number of variables (fun fact: let used to be O(N^2) in number of variables, until very recent versions of Julia [1.8+?], where it is now O(N), thanks to some code with thousands of let causing this to actually be problematic). This is the classic workaround.

You could also manually chunk your iteration range, perhaps separating the code that iterates over a chunk into a separate function, and use @spawn. I also often do this anyway when I want more control over task-local state, e.g. rngs. @spawn is of course more dynamic than @threads, which you may or may not want.

2 Likes

I’m actually using @spawn in my code. Thanks for the tip.

I think that will be the best workaround here.

Regardless of whether or not there are performance implications, I would still strongly prefer if mutating bindings.

julia> @testset "Closure" begin
       function foo(x)
         z = 2x
         return 3z
       end
       z = 3
       @test foo(z) == 18
       @test z == 3
       end
Closure: Test Failed at REPL[6]:8
  Expression: z == 3
   Evaluated: 6 == 3
Stacktrace:
 [1] macro expansion
   @ ~/Documents/languages/julia/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:464 [inlined]
 [2] macro expansion
   @ REPL[6]:8 [inlined]
 [3] macro expansion
   @ ~/Documents/languages/julia/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:1360 [inlined]
 [4] top-level scope
   @ REPL[6]:2
Test Summary: | Pass  Fail  Total  Time
Closure       |    1     1      2  0.3s
ERROR: Some tests did not pass: 1 passed, 1 failed, 0 errored, 0 broken.

This is just confusing.
If you encounter this bug in the wild – without knowing to look for it – you might try copy/pasting it into the REPL, only to find out that the tests pass / you can’t reproduce it.
Spooky heisenbugs at a distance.

When someone actually does want this behavior, it’d be much better to make it explicit via capturing and mutating a RefValue or some other mutable struct.

Still, would be nice for the performance gets fixed, too.

2 Likes

Very interesting. My workflow regarding common performance pitfalls is usually asking a profiler and/or JET.jl for advice. Do you think the compiler itself should warn us somehow (this would resemble clang sanitizers)?