I’d like to keep the structure of a for loop because I have something like this:
function func(R)
    b = zero(paramvec)
    for i in 1:R
        sdata = datasimulator()   # returns a vector of data
        b += estimatemodel(sdata) # returns a vector of parameters
    end
    avgb = b ./ R # averages parameters across simulations
    return likelihood(avgb)
end
I find it strange that if I start julia with -t auto, getting 4 threads, I obtain the same result as with 1 thread, but if I set -t 6 I get a different result. What could be going on? And how do I know that the results with auto are correct?
function func(R)
    b = [zero(paramvec) for _ in 1:Threads.nthreads()] # Don't use zeros(paramvec)
    Threads.@threads :static for i in 1:R
        sdata = datasimulator() # returns a vector of data
        b[Threads.threadid()] += estimatemodel(sdata) # returns a vector of parameters
    end
    bsum = sum(b)
    avgb = bsum ./ R # averages parameters across simulations
    return likelihood(avgb)
end
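To be clear about what was going wrong in the original version: b += estimatemodel(sdata) under Threads.@threads is a read-modify-write on shared state, so concurrent iterations race and can silently drop updates. The per-thread buffers above fix that, and :static pins each iteration to one thread so indexing by threadid() is safe. Any remaining tiny differences across thread counts are expected, since floating-point addition is not associative and different thread counts partition the sum differently. Here is a minimal runnable sketch of the same pattern, with toy stand-ins for datasimulator/estimatemodel (hypothetical, since those functions are not shown in the post):

```julia
# Minimal runnable sketch of the per-thread-buffer pattern, with toy
# stand-ins for datasimulator/estimatemodel (hypothetical names).
using Base.Threads

toysimulate(i) = fill(Float64(i), 3)  # stand-in: "simulated data" for draw i
toyestimate(sdata) = sdata .+ 1       # stand-in: "estimated parameters"

function avg_threaded(R)
    # one accumulator per thread: no two threads ever write the same slot,
    # and :static pins iterations to threads, so threadid() is safe to index with
    b = [zeros(3) for _ in 1:nthreads()]
    @threads :static for i in 1:R
        b[threadid()] .+= toyestimate(toysimulate(i))
    end
    return sum(b) ./ R
end

function avg_serial(R)
    b = zeros(3)
    for i in 1:R
        b .+= toyestimate(toysimulate(i))
    end
    return b ./ R
end
```

With these small integer-valued toys the two versions agree exactly at any thread count; with real floating-point data expect agreement only up to roundoff, so compare with ≈ rather than ==.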
You are looking for something like FLoops.jl, which hides a lot of the boilerplate. One more thing: you are missing a dot in b += ..., so you allocate a new vector on every iteration.
julia> using FLoops
julia> using BenchmarkTools # for @btime below
julia> const test = rand(1000);
julia> function func(R)
           b = zero(rand(1000))
           for i in 1:R
               b .+= test # returns a vector of parameters
           end
           avgb = b ./ R # averages parameters across simulations
           return avgb
       end
func (generic function with 1 method)
julia> function func2(R)
           b = zero(rand(1000))
           @floop for i in 1:R
               @reduce b .+= test # returns a vector of parameters
           end
           avgb = b ./ R # averages parameters across simulations
           return avgb
       end
func2 (generic function with 1 method)
julia> @btime func(1000);
129.000 μs (3 allocations: 23.81 KiB)
julia> @btime func2(1000);
39.667 μs (35 allocations: 57.33 KiB)
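Before trusting the speedup, it is worth checking that the parallel reduction matches the serial one. A condensed, self-contained version of the two functions above (same @reduce syntax; exact equality is not guaranteed in general, because the summation order differs between executors):

```julia
using FLoops

const data = rand(1000)  # stands in for `test` above

function serialavg(R)
    b = zeros(length(data))
    for i in 1:R
        b .+= data
    end
    return b ./ R
end

function floopavg(R)
    b = zeros(length(data))
    @floop for i in 1:R
        @reduce b .+= data  # parallel in-place reduction
    end
    return b ./ R
end
```

serialavg(1000) ≈ floopavg(1000) should hold at any thread count, with only roundoff-level differences.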
Ok, what’s the easiest way of doing that? Would this work?
1. In a file named multiproc.jl, write
using Distributed
using Folds
@everywhere include("main.jl")
where main.jl is the original file that I want to run (and which runs fine serially). That main.jl file calls other .jl files, including the one in which I perform the simulations in my post above, now modified to use Folds as follows:
function func(R, params)
    function sdatamodel(i)
        sdata = simulate_data()
        return auxiliarymodel(sdata)
    end
    avgb = Folds.mapreduce(sdatamodel, +, 1:R, DistributedEx(); init=zero(params)) ./ R
    return likelihood(avgb)
end
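This looks like the right shape. As a self-contained sketch of the same mapreduce pattern that can be tried without any workers, here is a version with toy stand-ins for simulate_data/auxiliarymodel (hypothetical names) and SequentialEx from Transducers.jl (a Folds dependency); swapping in DistributedEx() once workers exist is the only change:

```julia
using Folds
using Transducers: SequentialEx  # swap in DistributedEx() once workers exist

toydata(i) = fill(Float64(i), 2)  # stand-in for simulate_data()
toymodel(sdata) = 2 .* sdata      # stand-in for auxiliarymodel(sdata)

function toyfunc(R, params; ex = SequentialEx())
    # same pattern as above: map each draw to a parameter vector, sum, average
    avgb = Folds.mapreduce(i -> toymodel(toydata(i)), +, 1:R, ex;
                           init = zero(params)) ./ R
    return avgb
end
```

With these toys, toyfunc(3, zeros(2)) averages 2 .* [i, i] over i = 1:3, i.e. [4.0, 4.0], regardless of executor.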
After that, I perform some other operations, like optimizing over some parameters, which is necessarily a serial operation.
2. Run multiproc.jl on the computer cluster with julia -p 100 multiproc.jl.