What is the correct way to use Base.OncePerThread?

I want to use Base.OncePerThread but it’s not clear what the intended usage is.

Is the following a correct, safe, and intended way to use Base.OncePerThread?

finalset = Set()
tempsets = OncePerThread() do
    return Set()
end
Threads.@threads for i in 1:100
    push!(tempsets(), rand("qwe", 3) |> String)
end
usedtempsets = @view tempsets.xs[tempsets.ss .== 0x01]
union!(finalset, usedtempsets...)
finalset

If this is intended, why is it such a pain to get the per-thread values out again?

Just tempsets.xs[tempsets.ss .== 0x01] without the @view errors, and OncePerThread doesn’t support iterate. So am I not supposed to access the memory after the threads are done with it?

Contrary to what the name of the macro implies, @threads does not launch OS threads (which OncePerThread is referring to), but tasks (which is what OncePerTask is about). I believe OncePerThread is mostly intended for doing per-OS-thread initialization of third party C libraries that assume OS threads as the means of concurrency.
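To make the task-based view concrete, here is a minimal sketch that uses only Base: each spawned task builds its own Set, so no lock is needed, and the results are merged once every task has finished. The names `collect_strings` and `ntasks` are hypothetical and chosen only for illustration; this is one possible arrangement, not an official recommendation.

```julia
# Sketch: one Set per task (not per OS thread), merged after all tasks finish.
# `collect_strings` and `ntasks` are hypothetical names used for illustration.
function collect_strings(n::Integer; ntasks::Integer = Threads.nthreads())
    # Split 1:n into roughly `ntasks` contiguous chunks (at least 1 item each).
    chunksize = max(1, cld(n, ntasks))
    tasks = map(Iterators.partition(1:n, chunksize)) do chunk
        Threads.@spawn begin
            local_set = Set{String}()   # owned by this task only
            for _ in chunk
                push!(local_set, String(rand("qwe", 3)))
            end
            local_set                   # value returned by fetch
        end
    end
    # fetch waits for each task; union! merges into a fresh accumulator.
    return reduce(union!, fetch.(tasks); init = Set{String}())
end

finalset = collect_strings(100)
```

Because every Set is created inside its task and only read after `fetch`, it does not matter which OS thread a task happens to run on.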


I think most of the pain here stems from the fact that Julia’s parallelism is really focused on Tasks not on threads.

That being said: using OhMyThreads.jl you can write:

using OhMyThreads: tmap, chunks

tmpsets = tmap(chunks(1:100)) do chunk
    tmpset = Set()
    for i in chunk
        push!(tmpset, rand("qwe", 3) |> String)
    end
    return tmpset
end
reduce(union!, tmpsets)

Which reads and feels much clearer to me.

Indeed, the reason I’m using OncePerThread in the example is because Set is not thread safe and I don’t want to use locks since I don’t need to.

So all OncePerThread is supposed to do in the example is allocate memory that each thread “owns”, and then let me clean up once the threads are done by collecting all their work.

That OhMyThreads example doesn’t seem to work. chunks complains that it needs a number of chunks, and then tmap gives me an

UndefVarError: `threadpool` not defined in `OhMyThreads.Implementation`

I’m using OhMyThreads v0.8.3