Simple example of @threads

How can I make this simple example thread safe?

function func(R=5)
    s = 0.0
    Threads.@threads :static for i in 1:R
        s += 1
    end
    return s
end

If it makes any difference, in my application I am summing over a very expensive function, instead of just integers.

Maybe try ThreadsX.sum?

1 Like

I’d like to keep the structure of a for loop because I have something like this:

function func(R)
    b = zero(paramvec)
    for i in 1:R
        sdata = datasimulator() # returns a vector of data
        b += estimatemodel(sdata) # returns a vector of parameters
    end
    avgb = b ./ R # averages parameters across simulations
    return likelihood(avgb)
end

Can that be done?

I think it can be achieved with ThreadsX.mapreduce

If your function is more expensive than summing floats then this will do the job.

(You said summing integers but your s = 0.0 is a float)

function func(R=5)
    s = Vector{Float64}(undef, R)
    Threads.@threads :static for i in 1:R
        s[i] = 1 # expensive 1
    end
    return sum(s)
end
2 Likes

I see Folds.jl is a generalisation of ThreadsX.jl, can I use multiprocessing and multithreads simultaneously?

Note though, that this can lead to false sharing.

2 Likes

I find it strange that if I start julia with -t auto, getting 4 threads, I obtain the same result as with 1 thread, but if I set -t 6 I get another result. What could be going on? how do I know that other results with auto would be correct?

This is a data race. It can happen with two threads as well!

Here is a possible scenario

A:                 B:
read s             read s
s++                s++
write s
                   write s
1 Like

I suggest these changes to avoid races:

function func(R)
    b = [zero(paramvec) for _ in 1:Threads.nthreads()] # Don't use zeros(paramvec)
    Threads.@threads :static for i in 1:R
        sdata = datasimulator() # returns a vector of data
        b[Threads.threadid()] += estimatemodel(sdata) # returns a vector of parameters
    end
    bsum = sum(b)
    avgb = bsum ./ R # averages parameters across simulations
    return likelihood(avgb)
end
1 Like

You are looking for something like Floops.jl, which hides a lot of the boilerplate. Another thing, you are missing a dot so you allocate on each iteration.

julia> using FLoops

julia> const test = rand(1000)

julia> function func(R)
           b = zero(rand(1000))
           for i in 1:R
              b .+= test # returns a vector of parameters
           end
           avgb = b ./ R # averages parameters across simulations
           return avgb
       end
func (generic function with 1 method)

julia> function func2(R)
           b = zero(rand(1000))
           @floop for i in 1:R
              @reduce b .+= test # returns a vector of parameters
           end
           avgb = b ./ R # averages parameters across simulations
           return avgb
       end
func2 (generic function with 1 method)

julia> @btime func(1000);
  129.000 μs (3 allocations: 23.81 KiB)

julia> @btime func2(1000);
  39.667 μs (35 allocations: 57.33 KiB)
2 Likes

Maybe there’s a more efficient way to do that. :wink:

In summary, what’s best to use between FLoops, Folds, ThreadsX? in terms of ease of use, for a non-computer scientist? (is that a fair comparison?)

2 Likes

:joy: I wanted to keep it as similar as the original :wink:

Whatever you find easier to use is the best.

1 Like

If I want to make use of my university computer cluster, I suppose it would be better to use multi-processing, rather than threaded execution, no?

If you want to run a computation across multiple machines, then yes.

1 Like

Ok, what’s the easiest way of doing that? Would this work?

  1. In a file named multiproc.jl, write
using Distributed
using Folds

@everywhere include("main.jl")

where main.jl is the original file that I want to run (and which runs fine serially). That main.jl file calls other .jl files, including the one in which I perform the simulations in my post above, now modified to use Folds as follows:

function func(R, params)
        function sdatamodel(i)
            sdata = simulate_data()
        return auxiliarymodel(sdata)
    end

    avgb = Folds.mapreduce(sdatamodel, +, 1:R, DistributedEx(); init=zero(params)) ./ R
    return likelihood(avgb)
end

After that, I perform some other operations, like optimizing over some parameters which necessarily is a serial operation.
2. Run multiproc.jl in the computer cluster with julia -p 100 multiproc.jl.