PSA: Reasoning about scope rules and multithreading

Perhaps this has been discussed before and I missed it, but during my first experimentation with multithreading, I suddenly realized that Julia’s scoping rules in loops, which a number of people find confusing (including myself when I first learnt about them), are really convenient in that context.

One of the things that made me wary of taking advantage of multithreading before was the fear of screwing up the values of variables when running the iterations in god-knows-which order (I had not learnt the meaning of “race condition” until recently, but I had an intuition of the concept). And when I finally decided to go for it in Julia, I was comforted to realize that each iteration creates its own scope, so that I only had to take care of the variables that have been defined outside the loop: I can trust that variables that are exclusively defined inside the loop can’t be “touched” by any other iteration that runs in parallel.
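To make that concrete, here is a minimal sketch (the function name squares and the variable names are made up for illustration). The only variable shared between iterations is out, which is defined before the loop; tmp is first assigned inside the loop body, so each iteration gets its own copy.

```julia
using Base.Threads

# Hypothetical example: square each element in parallel.
function squares(xs)
    out = similar(xs)        # defined outside the loop: shared by all iterations
    @threads for i in eachindex(xs)
        tmp = xs[i]          # first assigned inside the loop: a fresh local per iteration
        out[i] = tmp * tmp   # every iteration writes to its own slot, so no race here
    end
    return out
end

squares([1, 2, 3, 4])
```

This only holds as long as no variable called tmp exists in the enclosing function; if one did, the assignment in the loop would refer to that outer variable instead.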

@StefanKarpinski said once that the main motivation for Julia’s scoping rules was closures. But even if this advantage for multithreading is only a nice side effect, I think that it is worth mentioning. Probably nobody will care about either closures or multithreading on their first day with Julia, but the benefit of keeping variables apart between iterations in threaded loops is easy to explain and understand - in my opinion even easier than the advantages that Julia’s scope rules have for creating closures.


Regarding scoping and concurrent programming, allow me to bring up yet another (cautionary) PSA.

tl;dr Julia’s closures are great. But you still have to be careful about assignments.

Consider the following type of code that has no race at the moment:

function bigfunction(...)
    ...
    # very long lines of code
    ...
    @sync for x in xs
        @async begin
            y = f(x)
            g(h(x), y)
        end
    end
end

You might tweak this code later by adding some innocent-looking code outside of the portion using @async:

function bigfunction(...)
    if ...
        y = ...  # added
    end
    ...
    # very long lines of code
    ...
    @sync for x in xs
        @async begin
            y = f(x)
            g(h(x), y)
        end
    end
end

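This now introduces a data race. Since y is now a local variable of bigfunction, the assignment y = f(x) inside the @async block no longer creates a new local; it assigns to that captured, shared y, which is then mutated concurrently by multiple tasks. It would be very difficult to catch this in a code review if there are many lines between the newly added code and the task-spawning portion.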

Note that it does not matter whether you use threading or not for this example. Even if you use @async, you have an incorrect program (unless you really mean to mutate y from different tasks). It’s just that debugging and detecting the bug is much harder with @spawn.
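A small sketch of how this can go wrong even with @async. The function shared_y is hypothetical, and the explicit yield() stands in for anything inside f, g, or h that would let another task run (I/O, a lock, a sleep):

```julia
# Hypothetical reduction of the problem: y is a local of the enclosing function,
# so every task assigns to the same captured variable.
function shared_y(xs)
    y = 0                           # plays the role of the "added" line
    results = zeros(Int, length(xs))
    @sync for (i, x) in enumerate(xs)
        @async begin
            y = x                   # assigns to the OUTER y, shared by all tasks
            yield()                 # give other tasks a chance to run and overwrite y
            results[i] = y          # may read a value written by a different task
        end
    end
    return results
end

shared_y([1, 2, 3])
```

Each slot of the result ends up holding whatever value y had when that task resumed, which need not be the x that task started with.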

You can make the above program correct again by using local (or let):

function bigfunction(...)
    if ...
        y = ...
    end
    ...
    # very long lines of code
    ...
    @sync for x in xs
        @async begin
            local y = f(x)
            g(h(x), y)
        end
    end
end

I think it’d be better to warn or throw an error for this type of code when the compiler finds it. Meanwhile, you can avoid this by making the @async/@spawn block as small as possible, or by always using let/local if you need assignments.
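For the let variant, a minimal sketch (private_y is a made-up name, and yield() is only there to force task switches): the let block introduces a new y for each task, so the outer y is never touched.

```julia
# Hypothetical example: the same shape of code, but y is bound by `let`,
# so each task works on its own private variable.
function private_y(xs)
    y = 0                           # outer local with the same name
    results = zeros(Int, length(xs))
    @sync for (i, x) in enumerate(xs)
        @async let y = x            # new binding, private to this task
            yield()                 # other tasks run here, but cannot see this y
            results[i] = y
        end
    end
    return (results, y)             # the outer y is returned to show it is unchanged
end

private_y([1, 2, 3])   # returns ([1, 2, 3], 0)
```

Writing local y = x inside a begin block would have the same effect; let just makes the intent visible at the top of the block.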


This was actually one of the motivations for how the scope rules were designed. Early on we were even considering automatic parallelization of comprehensions (we still might at some point!) or even for loops. We didn’t end up doing that, but if you’re going to even be able to consider it, you really want to avoid creating spurious variable dependencies between iterations. This dictates that loop iterations have their own scope, so that locals assigned in the loop don’t inadvertently spill out and require synchronization. It also dictates that each iteration has its own separate locals rather than reusing them, since otherwise the value of a local from a previous iteration would be visible from a later one, creating a temporal dependency that would prevent parallelization.
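The "separate locals per iteration" part is easy to see with closures. In this small sketch (collect_closures is a made-up name), each closure captures the i of its own iteration rather than one variable shared by the whole loop:

```julia
# Hypothetical example: each iteration gets a fresh binding of i,
# so closures created in different iterations do not share state.
function collect_closures()
    fs = Function[]
    for i in 1:3
        push!(fs, () -> i)      # captures this iteration's i
    end
    return [f() for f in fs]
end

collect_closures()   # returns [1, 2, 3]
```

If the loop reused a single i across iterations, every closure would see the last value, and the iterations could not be treated as independent.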


As a note, this is captured in JuliaLang/julia issue #14948 on GitHub, “Race condition caused by variable scope getting lifted from a multithreaded context”.
