Number of threads available seems random (using @spawn)

I think the poor scaling you’re seeing comes from the default global RNG, which can perform badly when it is called from multiple threads at the same time.

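IIRC, the cause is cache invalidation (false sharing): the default global RNG state for each thread is stored next to the others in an array without any padding, so when one thread updates its RNG state, the other cores have to reload that cache line… or something like that.

Here are my timings with the global RNG (2 threads, then 8 threads), followed by 8 threads after switching to an explicit RNG per task: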

[me@redmi ~]$ time ~/julia/bin/julia -t2 /tmp/error.jl
2
[DateTime("2020-08-09T15:24:23.878")]
2.499996828176266e9

real    0m20.020s
user    0m39.258s
sys     0m0.802s
[me@redmi ~]$ time ~/julia/bin/julia --inline=yes --optimize=3  --math-mode=fast --check-bounds=no -t8 /tmp/error.jl
8
[DateTime("2020-08-09T15:24:54.298"), DateTime("2020-08-09T15:24:54.298"), DateTime("2020-08-09T15:24:54.298"), DateTime("2020-08-09T15:24:54.298"), DateTime("2020-08-09T15:24:54.298"), DateTime("2020-08-09T15:24:54.298"), DateTime("2020-08-09T15:24:54.298")]
2.500007301917459e9
2.500005161541428e9
2.4999788758251624e9
2.500002200890986e9
2.500027065633972e9
2.5000236820377455e9
2.499984928049075e9

real    0m39.881s
user    4m15.892s
sys     0m0.590s
[me@redmi ~]$ vim /tmp/error.jl  # I made the code change below to explicit rng
[me@redmi ~]$ time ~/julia/bin/julia -t8 /tmp/error.jl
8
[DateTime("2020-08-09T15:28:19.239"), DateTime("2020-08-09T15:28:19.239"), DateTime("2020-08-09T15:28:19.239"), DateTime("2020-08-09T15:28:19.239"), DateTime("2020-08-09T15:28:19.239"), DateTime("2020-08-09T15:28:19.239"), DateTime("2020-08-09T15:28:19.239")]
2.500025329225046e9
2.499977304118077e9
2.5000150674239006e9
2.4999986608286796e9
2.5000115835831914e9
2.4999994211369123e9
2.500023322898655e9

real    0m19.713s
user    2m5.908s
sys     0m0.546s
using Dates
import Random

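# Sum 5*10^9 uniform draws using an RNG private to this task, not the global one.
function f(i)
  rng = Random.MersenneTwister(i)  # explicit RNG, seeded with the task index
  s = 0.
  for _ in 1:(5*10^9)  # counter is unused; `_` avoids shadowing the argument i
    s += rand(rng)
  end
  return s
end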

# Spawn one task fewer than the number of threads, then print each result.
function t()
  n = Threads.nthreads()-1
  dates = Vector{DateTime}(undef,n)
  task = Vector{Task}(undef,n)
  for i in 1:n
    task[i] = Threads.@spawn f(i)
    dates[i] = Dates.now()  # time at which task i was spawned
  end
  println(dates)
  for i in 1:n
    println(fetch(task[i]))  # blocks until task i has finished
  end
end


println(Threads.nthreads())
t()
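
If you want to check this on your own machine without waiting 20 seconds per run, here is a minimal sketch that times both variants in one script. The helper names (`sum_rand`, `sum_rand_global`, `run_tasks`, `timed_run`) and the workload size `n` are made up for illustration and are not part of the script above. Also note that newer Julia versions (1.7 and later) use a task-local default RNG, so the gap may be small or absent there.

```julia
import Random

# Sum n uniform draws from an explicitly supplied RNG.
function sum_rand(rng::Random.AbstractRNG, n::Integer)
    s = 0.0
    for _ in 1:n
        s += rand(rng)
    end
    return s
end

# Sum n uniform draws from the default global RNG.
function sum_rand_global(n::Integer)
    s = 0.0
    for _ in 1:n
        s += rand()
    end
    return s
end

# Spawn `ntasks` tasks running `work(i)` and collect their results.
function run_tasks(work, ntasks::Integer)
    tasks = [Threads.@spawn work(i) for i in 1:ntasks]
    return fetch.(tasks)
end

# Run the tasks and return (results, elapsed seconds).
function timed_run(work, ntasks::Integer)
    t0 = time_ns()
    results = run_tasks(work, ntasks)
    return results, (time_ns() - t0) / 1e9
end

n = 10^7                                  # hypothetical, much smaller workload
ntasks = max(Threads.nthreads() - 1, 1)   # same "nthreads - 1" choice as above

# Warm-up runs so that compilation time is not included in the timings.
run_tasks(i -> sum_rand_global(10), ntasks)
run_tasks(i -> sum_rand(Random.MersenneTwister(i), 10), ntasks)

sums_default, t_default = timed_run(i -> sum_rand_global(n), ntasks)
sums_explicit, t_explicit = timed_run(i -> sum_rand(Random.MersenneTwister(i), n), ntasks)

println("default global RNG: ", t_default, " s")
println("explicit RNG:       ", t_explicit, " s")
```

Run it with e.g. `julia -t8` and compare the two printed times. Each sum should come out close to `n/2`, which is an easy sanity check that both variants do the same amount of work.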