Future.randjump peculiar speed

At my company we typically break simulations into predefined fixed “chunks” and assign a seed to each chunk; this way, the dynamic thread scheduling is repeatable no matter the number of threads.

I have been using Future.randjump to generate a new seed/state for each chunk. After experimenting a bit, it seems that randjump is pretty slow even for the default jump of big(10)^20. I thought this jump was supposed to be precomputed, so using this jump amount would make randjump faster.

In fact, I noticed no change in runtime for generating 10 seeds, whether I used the default of big(10)^20 or not.

Any thoughts on how to speed up randjump, or why the default isn’t faster? In my “real world” code, I am calling randjump from separate threads to speed things up, but it is actually slower than my simulation.

using Future, Random

seed = 1234
nchunks = 10
jumpamt = big(10)^20

a = fill(MersenneTwister(),nchunks)

m = MersenneTwister(seed)
a[1] = m

for i = 2:nchunks
    @inbounds a[i] = Future.randjump(m, jumpamt)
end

Looking at calc_jump (called via randjump), it looks like the polynomial for any number of steps is cached. So I guess you don’t see the difference if you use the same step count more than a couple of times.
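A quick way to see the caching effect (a minimal sketch; the step count here is arbitrary and the timings will vary by machine):

```julia
using Future, Random

m = MersenneTwister(0)
steps = big(10)^6  # an arbitrary non-default jump size, for illustration

# The first call with a given step count computes the jump polynomial
# and stores it in calc_jump's cache.
@time Future.randjump(m, steps)

# A subsequent call with the same step count reuses the cached polynomial.
@time Future.randjump(m, steps)
```

The second @time should report a much smaller figure, since only the polynomial computation is cached, not the rest of the jump.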

To be fair, compared to the rest of what Future.randjump does, the computation cached by calc_jump does seem to be much faster for big(10)^20 than for other values:

julia> @btime Future.randjump(m, big(10)^20) setup=(m = MersenneTwister(0));
  14.303 ms (17 allocations: 22.77 KiB)

julia> @btime Random.DSFMT.GF2X(Random.DSFMT.JPOLY1e20);
  17.986 μs (5 allocations: 7.52 KiB)

julia> @btime Random.DSFMT.powxmod(big(10)^20 + 1, Random.DSFMT.CharPoly());
  92.881 ms (17266 allocations: 12.34 MiB)

By the way, I just realized that calc_jump uses a bare (unlocked) Dict to cache the jump polynomials.

So calling randjump from different threads seems to be a bit dangerous, unless you call randjump first on the main thread to populate the cache.
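One workaround, assuming the Dict is only unsafe under concurrent writes, is to warm the cache serially before spawning threaded work (a sketch, not a guaranteed thread-safety fix):

```julia
using Future, Random

jumpamt = big(10)^20

# Populate calc_jump's cache once, on the main thread, so the later
# threaded calls with the same step count should only read the Dict.
Future.randjump(MersenneTwister(0), jumpamt)

rngs = Vector{MersenneTwister}(undef, Threads.nthreads())
Threads.@threads for i in eachindex(rngs)
    # Same jump amount as the warm-up call, so this should hit the cache.
    rngs[i] = Future.randjump(MersenneTwister(i), jumpamt)
end
```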

Just a nitpick, but I guess you meant to write a[i-1] instead of m?
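For reference, the loop with that fix applied might look like this (same variables as the original snippet):

```julia
using Future, Random

seed = 1234
nchunks = 10
jumpamt = big(10)^20

a = Vector{MersenneTwister}(undef, nchunks)
a[1] = MersenneTwister(seed)
for i = 2:nchunks
    # Jump from the previous chunk's state, not from the original m,
    # so consecutive chunks get consecutive non-overlapping streams.
    a[i] = Future.randjump(a[i-1], jumpamt)
end
```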