PSA: Reasoning about scope rules and multithreading

Regarding scoping and concurrent programming, allow me to bring up yet another (cautionary) PSA.

tl;dr Julia’s closures are great. But you still have to be careful about assignments.

Consider the following type of code that has no race at the moment:

function bigfunction(...)
    ...
    # very long lines of code
    ...
    @sync for x in xs
        @async begin
            y = f(x)
            g(h(x), y)
        end
    end
end

You might tweak this code later by adding some innocent-looking code outside of the portion using @async:

function bigfunction(...)
    if ...
        y = ...  # added
    end
    ...
    # very long lines of code
    ...
    @sync for x in xs
        @async begin
            y = f(x)
            g(h(x), y)
        end
    end
end

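This introduces a data race. Because y is now a local variable of bigfunction, the assignment y = f(x) inside the @async block no longer creates a new variable for each task; the closure captures the outer y, so every task writes to the same variable concurrently. This is very hard to catch in a code review when many lines separate the newly added code from the task-spawning portion.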

Note that it does not matter whether you use threading in this example. Even with @async, the program is incorrect (unless you really mean to mutate y from different tasks). It’s just that detecting and debugging the bug is much harder with @spawn.
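Here is a small runnable sketch of the problem. The function name shared_y, the input 1:3, and the 10x computation are made up for illustration, and yield() stands in for any call inside f, g, or h that may switch tasks (I/O, sleep, locks).

```julia
# Hypothetical reproduction: `y` is assigned at function level before the loop,
# so the closure created by @async captures that same variable.
function shared_y(xs)
    y = 0                      # plays the role of the "innocent-looking" added code
    seen = Int[]
    @sync for x in xs
        @async begin
            y = 10x            # writes to the captured outer `y`
            yield()            # let other tasks run, as blocking I/O would
            push!(seen, y)     # may read a value written by another task
        end
    end
    return seen, y
end

seen, final_y = shared_y(1:3)
@show seen final_y
```

Since the tasks write to the outer variable, final_y is no longer 0. The contents of seen depend on task scheduling, but you should usually see repeated values instead of 10, 20, and 30 appearing once each.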

You can make the above program correct again by using local (or let):

function bigfunction(...)
    if ...
        y = ...
    end
    ...
    # very long lines of code
    ...
    @sync for x in xs
        @async begin
            local y = f(x)
            g(h(x), y)
        end
    end
end
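As a runnable sketch of the same fix, here are both variants side by side. The names local_y and let_y, the input 1:3, and the 10x computation are illustrative assumptions; yield() again stands in for any call that may switch tasks.

```julia
# Hypothetical demo: `local` gives each task its own `y`,
# even though the enclosing function already has a variable named `y`.
function local_y(xs)
    y = 0
    seen = Int[]
    @sync for x in xs
        @async begin
            local y = 10x      # new variable per task; shadows the outer `y`
            yield()
            push!(seen, y)
        end
    end
    return seen, y
end

# Same idea with `let`, which also introduces a fresh binding.
function let_y(xs)
    y = 0
    seen = Int[]
    @sync for x in xs
        @async let y = 10x
            yield()
            push!(seen, y)
        end
    end
    return seen, y
end

seen_local, y_local = local_y(1:3)
seen_let, y_let = let_y(1:3)
@show seen_local y_local seen_let y_let
```

In both versions the outer y stays 0, and each task pushes the value it computed itself.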

I think it’d be better if the compiler warned or threw an error when it finds this type of code. Meanwhile, you can avoid the problem by keeping @async/@spawn blocks as small as possible, or by always using let/local when you need assignments inside them.
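One way to keep the block small is to move the task body into its own function, so that every assignment lives in that function's scope. This is only a sketch: the definitions of f, g, and h below are hypothetical stand-ins, and process and bigfunction_small_block are names I made up.

```julia
# Hypothetical stand-ins for the f, g, and h used in the post.
f(x) = 10x
h(x) = x + 1
g(a, b) = a + b

# All assignments live in the helper's own function scope,
# so they cannot collide with variables of the caller.
function process(x)
    y = f(x)
    return g(h(x), y)
end

function bigfunction_small_block(xs)
    y = -1                                    # unrelated local with the same name
    tasks = [@async process(x) for x in xs]   # the task body is a single call
    return fetch.(tasks), y
end

results, y_after = bigfunction_small_block(1:3)
@show results y_after
```

The @async expression contains no assignment at all, so adding or renaming variables elsewhere in the big function cannot change what the tasks do.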
