Multithreading, preallocation of objects, threadid() and task migration

tkf · September 3, 2021, 7:01pm

For example, consider:

function do_stuff!(matrix, x)
    matrix[1, 1] = a = rand()
    yield()  # or `@show`, or `@debug`, or write to a file, ...
    matrix[1, 1] += x
    @assert matrix[1, 1] == a + x  # this *can* fail
end

The Julia runtime may decide to suspend a task executing do_stuff!(matrices[1], x) when it hits yield and starts a new task at thread 1, executing do_stuff!(matrices[1], x) again but with different x. Then, by the time the original task is restarted, matrix[1, 1] has a completely different value.

(For similar discussion, see Is `@async` memory safe? - #8 by tkf. This is about @async but task migration occurs at yield points and so it explains why matrices[Threads.threadid()] style does not work in general.)

So, the only way to use the matrices[Threads.threadid()] style is to make sure that there is no I/O (yield points) in all code reachable from do_stuff!. Sometimes it’s doable, but it does not scale well. In particular, do_stuff! cannot be generic over its argument anymore since you don’t know what methods of the functions are going to be used. This is why I’m against matrices[Threads.threadid()] style.

Yeah, like putting FLoops in stdlib?