Multi-threading

krenova · April 3, 2019, 5:44pm

I’m using windows 10 and julia 1.0.3 and testing out the multi-threaded function.

I understand that when generating random numbers we would need a separate seed for each thread but ignoring that for the purpose of a very preliminary test, I created the following code. In the following im trying to assess the overheads of calling a multi-threaded for loop multiple times.

#set JULIA_NUM_THREADS=4

test = zeros(10000,1000);

@time for i in 1:10000
    Threads.@threads for j in 1:1000
        test[i,j] = rand(1)[1]
    end
end

When I ran the above code, I was returned a continuous stream of error messages, their counts likely correlated to the number of times the threaded loop was called. Sample of the errors are:

Error thrown in threaded loop on thread 0: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 2: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 1: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 0: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 3: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 1: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 2: InexactError(func=:check_top_bit, T=Int64, val=-8)
Error thrown in threaded loop on thread 3: InexactError(func=:check_top_bit, T=Int64, val=-8)

True enough, when i did a sum to check the results, sum(sum(test,dims=1),dims=2), my output was about 40081, way lesser than the expected value of 0.5 \times 10000 \times 1000.

Wondering what mistake I may have made…

Edit1: To think of it I suspect it’s sychronization issues, some threads may have finished and moved on to the next loop?

mbauman · April 3, 2019, 5:59pm

That’s precisely the problem. It’s not just that you need separate seeds — it’s that you need wholly distinct random number generators! Remember that calling rand doesn’t just return some value, it’s also mutating the state of your generator to move to the next value. When multiple threads try to mutate this state at the same time, you can rapidly run into inconsistencies.

yuyichao · April 3, 2019, 6:00pm

You can’t. Calling the same random number generator from multiple thread is undefined behavior and anything can happen.

Practically, this at least mean that you can corrupt/have corrupted the random number generate (giving you the error) and you’ll not get the correct performance measure. You’ll get very bad performance even without any error.

krenova · April 3, 2019, 6:01pm

Ah gosh, I see, this makes sense now. Thanks @mbauman and @yuyichao

Elrod · April 3, 2019, 7:33pm

I’d take a look at trandjump of you’d like solutions.

zymbuzz · April 3, 2019, 8:01pm

Does this also apply to distributed computing across cores? If let’s say instead of Threads.@threads, I use @distributed?

yuyichao · April 3, 2019, 8:08pm

No

DNF · April 3, 2019, 8:27pm

Byt the way, and incidentally, when people write rand(1)[1], what they almost always want is just rand().

rand(1)[1] allocates a length-1 vector and then reads out the first number, while rand() just returns a random value.

Aquaman · April 4, 2019, 2:01am

Hi @DNF , @mbauman ,@Elrod,

is Threads.@threads equal to the command parfor in MATLAB?

yuyichao · April 4, 2019, 2:43am

No afaict MATLAB do not have user controlled thread.

krenova · April 4, 2019, 2:46am

Good question.
@yuyichao, while there maybe no race conditions issue with generating random numbers with @distributed, I guess its safe to say that the random numbers also dont run the risk of overlaps since no race condition issues should mean that there is some synchronization involved between workers?

Tamas_Papp · April 4, 2019, 11:06am

I sometimes toy with the idea of calling it rand!. Strictly speaking, rand is a lie.

Aquaman · April 4, 2019, 11:27am

parfor

Execute for -loop iterations in parallel on workers

in Julia Threads.@threads + for == parfor in Matlab ???

Bernard_GODARD · April 4, 2019, 11:59am

This works:

test = zeros(10000,1000);

using Random


const generators = Dict{Int,Any}()
const mutex = Threads.Mutex()


function randomNumber()
   i = Threads.threadid()
   if !haskey(generators,i)
      lock(mutex)
      println("Creating random number generator for thread $i")
      generators[i] = MersenneTwister(rand(RandomDevice(),UInt8))
      unlock(mutex)
   end
   return rand(generators[i])
end


@time for i in 1:10000
    Threads.@threads for j in 1:1000
        test[i,j] = randomNumber()
    end
end

println(sum(test)/length(test))

prints

Creating random number generator for thread 1
Creating random number generator for thread 4
Creating random number generator for thread 2
Creating random number generator for thread 3
  1.071951 seconds (7.12 M allocations: 132.371 MiB, 17.82% gc time)
0.49999995453686974

However I don’t know if it can be improved by using a strongly typed dictionary instead of Dict{Int,Any}

ChrisRackauckas · April 4, 2019, 12:02pm

No. It is kind of like @parallel with some automatic data transfers. It requires separate processes, so much that using parfor on a cluster requires a new license for each core! That is very different than threading.

DNF · April 4, 2019, 12:06pm

Workers and threads are not the same. You can google “threads vs processes” or something to get an explanation. Workers are the same as processes.

kristoffer.carlsson · April 4, 2019, 12:23pm

No, parfor spawns multiple processes.

nilshg · April 4, 2019, 12:55pm

Wouldn’t it be using Distributed; @distributed for these days?

jpsamaroo · April 4, 2019, 3:28pm

A few changes, namely concretely typing the generators dictionary, hoisting generator initialization out of the loop, and putting stuff into a function to avoid the global scope:

using Random
using BenchmarkTools

const generators = Dict{Int,MersenneTwister}()

function dothread(gen)
    for i in 1:Threads.nthreads()
        if !haskey(generators,i)
            println("Creating random number generator for thread $i")
            gen[i] = MersenneTwister(rand(RandomDevice(),UInt8))
        end
    end
    test = zeros(10000,1000)
    Threads.@threads for j in 1:1000
        tidx = Threads.threadid()
        for i in 1:10000
            test[i,j] = rand(gen[tidx])
        end
    end
    return test
end

@btime dothread($generators)

test = dothread(generators)
println(sum(test)/length(test))

JULIA_NUM_THREADS=4 julia threaded-rand.jl 
Creating random number generator for thread 1
Creating random number generator for thread 2
Creating random number generator for thread 3
Creating random number generator for thread 4
  62.128 ms (32 allocations: 76.30 MiB)
0.49998218701926284

Topic		Replies	Views
Random numbers and threads General Usage question , multithreading , random	30	3862	March 29, 2022
Help with Julia multithreading General Usage question , multithreading	9	442	January 23, 2024
Multithreading and rand within a specific range New to Julia	5	489	November 27, 2019
How to run multiple threads, all using the same sequence of pseudorandom numbers, in Julia? General Usage multithreading , random	9	1081	August 30, 2021
Embarrassingly parallel multi-threading doesn't scale Performance multithreading	17	1613	October 16, 2021

Multi-threading

parfor

Related topics