Will `@info` corrupt `Threads.@threads for`?

My configuration is

julia> Threads.nthreads()
4

julia> Threads.nthreads(:default)
4

I have a function

function parallel_CG!(B, θ, β, μ, ν)
    for ite = ...
        ...
        for ba = ...
            sub_j_vec = ...
            println("before entering @threads, sub_j_vec = $sub_j_vec")
            Threads.@threads for i = eachindex(sub_j_vec)
                j = sub_j_vec[i]
                # @info "inside @threads" i j
                println("inside @threads, i = $i, j = $j")
            end
            error("No.1 can you see this error??")
        end
    end
end

When I call it, I get the expected result:

julia> parallel_CG!(B, θ, β, μ, ν)
before entering @threads, sub_j_vec = [1, 4, 2, 3]
inside @threads, i = 1, j = 1
inside @threads, i = 3, j = 2
inside @threads, i = 2, j = 4
inside @threads, i = 4, j = 3
ERROR: No.1 can you see this error??

However, if I uncomment the @info line in the function definition and re-run the function, I get this corrupted result:

julia> parallel_CG!(B, θ, β, μ, ν)
before entering @threads, sub_j_vec = [1, 4, 2, 3]
┌ Info: inside @threads
│   i = 1
└   j = 2
┌ Info: inside @threads
│   i = 4
└   j = 2
┌ Info: inside @threads
│   i = 2
└   j = 4
inside @threads, i = 1, j = 4
inside @threads, i = 4, j = 4
inside @threads, i = 2, j = 4
┌ Info: inside @threads
│   i = 3
└   j = 4
inside @threads, i = 3, j = 4
ERROR: No.1 can you see this error??

My question is: why?

For details, see "Julia crashes without reporting anything when I optimize a vector of models in parallel" (#5 by WalterMadelim).

Looks like there’s a variable named j in some surrounding scope. As a result, the binding j is captured and shared among all tasks. Renaming the inner j to something obviously unique, like _j_, should fix the problem (you’ll probably want to find a better name, this is just for demonstration purposes). Alternatively, adding local j in the inner scope should also work.

In other words, when a name already exists in an enclosing local scope, assigning to it in Julia behaves as if you had used the nonlocal keyword in Python.
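
To make the local j suggestion concrete, here is a minimal sketch. The function collect_pairs and its output vector are made up for illustration and are not part of the original code. It deliberately assigns to a j in the enclosing scope as well, which is the situation that would otherwise lead to a shared binding.

```julia
# Hypothetical helper (not from the original code): records (i, j) for every iteration.
function collect_pairs(v)
    out = Vector{Tuple{Int,Int}}(undef, length(v))
    Threads.@threads for i in eachindex(v)
        local j = v[i]      # `local` gives every iteration its own j
        out[i] = (i, j)     # each iteration writes to a distinct slot, so no data race
    end
    j = 9                   # a same-named variable in the enclosing scope
    return out
end

collect_pairs([2, 3, 4, 1])
```

With the local declaration in place, the result should always be [(1, 2), (2, 3), (3, 4), (4, 1)], regardless of how the iterations are scheduled.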

One way to avoid this pitfall is to factor out nested local scopes into separate functions. It could look something like this:

function inner_loop_body(sub_j_vec, i)
    j = sub_j_vec[i]
    # @info "inside @threads" i j
    println("inside @threads, i = $i, j = $j")
end

function parallel_CG!(B, θ, β, μ, ν)
    for ite = ...
        ...
        for ba = ...
            sub_j_vec = ...
            println("before entering @threads, sub_j_vec = $sub_j_vec")
            Threads.@threads for i = eachindex(sub_j_vec)
                inner_loop_body(sub_j_vec, i)
            end
            error("No.1 can you see this error??")
        end
    end
end

This way, it’s impossible for the inner j to accidentally refer to an outer one.

Looking at the full code from the other thread, the conflicting j is here:

            Threads.@threads for i = eachindex(sub_j_vec)
                j = sub_j_vec[i]
                ...
            end
            ...
            if Δ > COT
                j = sub_j_vec[ii]
                ...

It may look surprising that the binding of j would be shared between these two locations, since they’re in different blocks at the same level of nesting, and there’s no reference to j in any shared parent block. However, if does not introduce a new scope, so by assigning to j inside the if, you’re introducing a binding j that belongs to the surrounding scope, which is also the parent scope of the @threads loop. Hence, every time the tasks spawned by @threads assign to j, they’re all assigning to the shared binding inherited from the parent scope, rather than to their own local js.
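
The same scoping behaviour can be seen without any threads. The following is a small sketch (the function name last_inner_value is invented for this illustration): the assignment inside the if never runs, yet it still makes j a variable of the enclosing loop body, so the inner loop assigns to that j instead of creating its own.

```julia
# Hypothetical demo function, single-threaded on purpose.
function last_inner_value()
    result = 0
    for _ in 1:1
        for i in 1:3
            j = 10i      # no `local`: this assigns to the j of the enclosing loop body
        end
        result = j       # readable here only because j belongs to this scope
        if false
            j = -1       # never executed, but `if` adds no scope, so j lives in the loop body
        end
    end
    return result
end

last_inner_value()
```

This should return 30, the last value the inner loop wrote. If you delete the if block, the inner j becomes local to the inner loop again and the line result = j throws an UndefVarError instead.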

Yes, I can reproduce this:

julia> v = [2, 3, 4, 1];

julia> for _ = 1:1
           Threads.@threads for i = eachindex(v)
               j = v[i]
               @info "see" i j
           end
       end
┌ Info: see
│   i = 3
└   j = 4
┌ Info: see
│   i = 4
└   j = 1
┌ Info: see
│   i = 2
└   j = 3
┌ Info: see
│   i = 1
└   j = 2

julia> for _ = 1:1
           Threads.@threads for i = eachindex(v)
               j = v[i]
               @info "see" i j
           end
           j = 9
       end
┌ Info: see
│   i = 4
└   j = 1
┌ Info: see
│   i = 1
└   j = 1
┌ Info: see
│   i = 2
└   j = 1
┌ Info: see
│   i = 3
└   j = 1

So the scoping rule for Threads.@threads for is the same as for an ordinary for?
(The former is a bit subtle, because strictly speaking it is not a loop, e.g. it doesn’t allow a break.)

I took a closer look with the aid of @macroexpand. It seems that Threads.@threads splits the total work i = eachindex(v) into smaller chunks, according to how many threads are available. Therefore, indeed,

  • the expansion contains an ordinary for loop. Hence, the scoping rule is indeed similar.
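
As a rough mental model only (this is not the actual macro expansion, and chunked_foreach is a name invented here), the chunking can be sketched by hand with Threads.@spawn. Each task runs a plain for loop over its own chunk of indices, which is why the usual scoping rules apply inside it.

```julia
# Hypothetical, simplified stand-in for what Threads.@threads does conceptually.
function chunked_foreach(f, v)
    nchunks = Threads.nthreads()
    chunk_len = max(1, cld(length(v), nchunks))   # guard against a zero chunk length
    tasks = map(Iterators.partition(eachindex(v), chunk_len)) do idxs
        Threads.@spawn for i in idxs              # an ordinary for loop inside each task
            f(i, v[i])
        end
    end
    foreach(wait, tasks)                          # block until every chunk has finished
    return nothing
end

out = zeros(Int, 4)
chunked_foreach((i, x) -> out[i] = 2x, [2, 3, 4, 1])
```

After this runs, out should be [4, 6, 8, 2]. Any variable assigned inside the spawned loop body without local follows the same rules as in the @threads case.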

It is indeed a loop, and it allows a break, but the break will halt the loop in only the task which executes it. If you need all tasks to stop you must do it yourself, e.g. with a check:

stopit = Threads.Atomic{Bool}(false)
Threads.@threads for i in 1:N
    stopit[] && break
    local status = work(i)
    status == 0 && (stopit[] = true)
end
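
For anyone who wants to try this out, here is a self-contained variant of the same pattern. The function run_until_stop, its stop_at argument and the processed counter are assumptions added for the demo; they stand in for N and work above.

```julia
# Hypothetical runnable version of the stop-flag pattern.
function run_until_stop(N, stop_at)
    stopit = Threads.Atomic{Bool}(false)
    processed = Threads.Atomic{Int}(0)
    Threads.@threads for i in 1:N
        stopit[] && break                    # leaves only the current task's loop
        Threads.atomic_add!(processed, 1)    # count the iterations that really ran
        i == stop_at && (stopit[] = true)    # ask all other tasks to stop as well
    end
    return stopit[], processed[]
end

stopped, n = run_until_stop(1000, 1)
```

Here stopped should be true, while n depends on scheduling: the other tasks keep going until they next check the flag, so n can be anywhere between 1 and N.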

One difference is that these for loops are inside the closures created by the tasks, so there’s an extra layer of capturing bindings. However, closure capture by design works out the same as any other local scope (much to the dismay of many who’ve seen their performance tanked by the infamous issue #15276).
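
The capture behaviour can be checked with plain closures, no tasks involved. This is just a sketch, and the function names shared_capture and separate_capture are made up for it. The only difference between the two functions is the assignment to j after the loop.

```julia
# Hypothetical demo: j is also assigned in the function body, so there is one shared binding.
function shared_capture()
    fs = Function[]
    for i in 1:3
        j = 10i               # assigns to the function-level j
        push!(fs, () -> j)    # every closure captures that same binding
    end
    j = 0                     # this assignment is what makes j function-level
    return [f() for f in fs]
end

# Hypothetical demo: without the outer assignment, each iteration gets a fresh j.
function separate_capture()
    fs = Function[]
    for i in 1:3
        j = 10i
        push!(fs, () -> j)
    end
    return [f() for f in fs]
end

shared_capture(), separate_capture()
```

The first should give [0, 0, 0] because all closures see the final value of the one shared j, while the second should give [10, 20, 30]. The tasks created by @threads capture j in the same way, except that they also write to it concurrently.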