Add a value to task local storage if it is not already present

I have a workload that I want to multi thread. This involves calling a function in parallel, that needs some work space for computing. The following rules apply:

  • There may never be two computations using the same workspace at the same time.
  • If one computation is done, its workspace can be reused by another computation.
  • New workspaces can be created at will, but this is an expensive operation.

What is the recommended way to do this? Currently, I use the following pattern, but I don’t like the try catch here:

workspace = try
       task_local_storage(:workspace)
catch err
       task_local_storage(:workspace, create_workspace())
end

Could you have an array of workspaces?

workspaces = [create_workspace() for _ in 1:Threads.nthreads()]

function compute(workspaces, config)
    workspace = workspaces[Threads.threadid()]
    #... Do some compute
end

As long as you use something like Threads.@threads you shouldn’t ever overlap.

You could also use a Channel:

# setup workspaces channel
workspaces = Channel{YourWorkspaceType}(Threads.nthreads())
for i in 1:nthreads()
    put!(workspaces, create_workspace())
end
function compute(workspaces, config)
    workspace = take!(workspaces)
    #... Do some compute
    put!(workspaces, workspace)
end
1 Like

Thanks a lot, these are very interesting ideas!


function compute(workspaces, config)
    workspace = workspaces[Threads.threadid()]
    #... Do some compute
end

This one makes me nervous. AFAIU The scheduler is free to launch new tasks on threads that have unfinished tasks, which means workspace clashes.

# setup workspaces channel
workspaces = Channel{YourWorkspaceType}(Threads.nthreads())
for i in 1:nthreads()
    put!(workspaces, create_workspace())
end
function compute(workspaces, config)
    workspace = take!(workspaces)
    #... Do some compute
    put!(workspaces, workspace)
end

With this, I think there are no workspace clashes, great!
This might deplete the channel if the scheduler decides to use more tasks than threads. But that may even be desirable. In contrast to my solution, this ensures no more workspaces than strictly necessary are created.

1 Like