RNGs cannot be parallelized without modification: if the per-thread sequences are statistically related, the output no longer behaves like a random sample.
Previous discussions can be found here:
But can we use some alternative solutions?
Consider some classic algorithms that are probably not in use nowadays (most adapted from the book Numerical Analysis by T. Sauer):
# RNGs: random number generators.
# Computers are not capable of generating truly random numbers,
# but they can generate sequences with statistical randomness.

# 1. Julia's built-in solution
rand(100)

# LCG (linear congruential generator) pseudo-random solutions
function rng_Park_Miller1988(n; x1 = 3)
    x = zeros(n)
    x[1] = x1
    for i in 2:n
        x[i] = mod(16807 * x[i-1], 2147483647)  # 16807 = 7^5; 2147483647 = 2^31 - 1, a Mersenne prime
    end
    u = x ./ 2147483647
    return u
end
# Park & Miller's 1988 "minimal standard" generator; used as MATLAB's rand around version 4 (early 1990s)
rng_Park_Miller1988(100; x1 = 3)
# RANDU, IBM's infamous generator from the 1960s
function rng_randu(n; x1 = 3)
    x = zeros(n)
    x[1] = x1
    for i in 2:n
        x[i] = mod(65539 * x[i-1], 2147483648)  # 65539 = 2^16 + 3; 2147483648 = 2^31
    end
    u = x ./ 2147483648
    return u
end

rng_randu(100)
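RANDU is worth a closer look, since it is the textbook example of a bad LCG. Because 65539 = 2^16 + 3, every triple of consecutive states satisfies a fixed linear relation, which confines the points (u[k], u[k+1], u[k+2]) to just 15 planes in the unit cube. A quick check of the relation, reusing the recurrence above:

# Since 65539 = 2^16 + 3, the states satisfy
# x[k+2] == 6*x[k+1] - 9*x[k] (mod 2^31) for every k:
x = zeros(Int64, 1000)
x[1] = 3
for i in 2:1000
    x[i] = mod(65539 * x[i-1], 2^31)
end
all(mod(x[k+2] - 6x[k+1] + 9x[k], 2^31) == 0 for k in 1:998)   # true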
# iterative pseudo-random solutions
function rng_LMap(n; x1 = 0.4)
    x = zeros(n)
    x[1] = x1
    for i in 2:n
        x[i] = 1 - 2 * x[i-1]^2   # a logistic-type (Chebyshev) map on [-1, 1]
    end
    return (x ./ 2) .+ 0.5        # rescale to [0, 1]
end

rng_LMap(100)
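One caveat about this generator: the map x -> 1 - 2x^2 has an arcsine-shaped (not uniform) invariant density, so rng_LMap's outputs pile up near 0 and 1. A rough empirical check (the ~0.064 figure follows from the arcsine density; exact numbers vary with the seed):

u = rng_LMap(10^6)
count(v -> 0.45 < v < 0.55, u) / 10^6   # ~0.064, versus 0.1 for a true uniform RNG
# The conjugacy x = cos(pi * t) suggests acos.(2 .* u .- 1) ./ pi as a
# uniformized version of the same orbit.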
# self-avoiding (quasi-random) solution: the Halton sequence
function rng_Halton(n; base = 3)                # base should be prime
    b = zeros(Int, ndigits(n, base = base))     # enough base-`base` digits to count up to n
    u = zeros(n)
    for j in 1:n
        # increment the base-`base` counter stored in b (least significant digit first)
        i = 1
        b[1] += 1
        while b[i] > base - 1
            b[i] = 0
            i += 1
            b[i] += 1
        end
        # radical inverse: reflect the digits of j about the radix point
        u[j] = 0
        for k in 1:length(b)
            u[j] += b[k] * Float64(base)^(-k)
        end
    end
    return u
end

rng_Halton(100)
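Halton sequences are usually used in several dimensions, with a distinct prime base per coordinate. As a quick demo, a quasi-Monte Carlo estimate of pi using the function above (with the keyword `base` as renamed here):

hx = rng_Halton(10_000; base = 2)
hy = rng_Halton(10_000; base = 3)
4 * count(hx .^ 2 .+ hy .^ 2 .<= 1) / 10_000   # ~3.14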
It seems not hard if a random sequence can be generated from a "root" (seed): if we pre-calculate some of the terms as starting values for each thread, then the per-thread streams may end up independent. Is this mistaken?
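To make that concrete: the "pre-calculated starting values" idea is essentially jump-ahead, and Julia's stdlib already exposes it for MersenneTwister via Future.randjump (IIRC the stride is currently limited to multiples of 10^20, the precomputed jump polynomial). A minimal sketch:

using Random, Future

# Each stream starts 10^20 draws apart, a gap no realistic
# computation will cross, so threads cannot overlap.
master = MersenneTwister(123)
streams = [master]
for t in 2:Threads.nthreads()
    push!(streams, Future.randjump(streams[end], big(10)^20))
end
# thread t then draws privately from streams[t], e.g. rand(streams[t], 100)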
I found what Guy Steele mentioned in the last half of this talk interesting:
IIRC there is a type of SRNG that can be split in such a way that the distribution of the resulting streams combined is equivalent to the original stream (or something like that). It would be nice to look into (probably some follow-up studies of):
My understanding is that this would guarantee that the result of a parallel computation with an SRNG is deterministic even when the scheduler is dynamic (as in Julia), if you construct the computation as some combination of map/filter/reduce/etc. (edit: hmm… wait, does it work for any computation?)
All operations that affect the random number state (rand, randn, seed!, etc.) will then operate on only the current thread's RNG state. This way, multiple independent code sequences that seed and then use random numbers will individually work as expected.
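For example, under the quoted behavior (the per-thread default RNG of Julia >= 1.3), a sketch like this keeps each thread's seeding and draws independent:

using Random

results = zeros(Threads.nthreads())
Threads.@threads for i in 1:Threads.nthreads()
    Random.seed!(1000 + i)       # seeds only the current thread's RNG state
    results[i] = sum(rand(10))   # independent, per-thread reproducible draws
end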
Using different multipliers in a linear congruential generator (or a PCG, which uses an LCG to update state) will create unique, non-overlapping streams. You can take advantage of this for SIMD as well.
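A minimal illustration of that idea with two plain LCG streams modulo the Mersenne prime 2^31 - 1 (both multipliers below are known full-period choices; this is a sketch, not how VectorizedRNG actually does it):

function lcg_stream(mult, n; seed = 3)
    x = zeros(Int64, n)
    x[1] = seed
    for i in 2:n
        x[i] = mod(mult * x[i-1], 2147483647)
    end
    return x ./ 2147483647
end

lcg_stream(16807, 5)   # 7^5, the "minimal standard" multiplier
lcg_stream(48271, 5)   # the revised Park-Miller / C++ minstd_rand multiplier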
This is what I have done with VectorizedRNG and ProbabilityModels, which initialize one RNG per thread.
You can use RNGTest for validating an RNG or LCG multiplier (note that an LCG by itself will fail a lot of tests; a PCG is much better in this regard).
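Short of running a full battery, a few crude moment and bin checks in plain Julia will at least catch gross failures (no substitute for RNGTest/TestU01; this uses the rng_Park_Miller1988 defined earlier):

u = rng_Park_Miller1988(10^6)
abs(sum(u) / 10^6 - 0.5)          # sample mean, should be near 1/2
abs(sum(u .^ 2) / 10^6 - 1 / 3)   # E[U^2] = 1/3 for a uniform U
maximum(abs, [count(k / 10 .< u .<= (k + 1) / 10) / 10^6 - 0.1 for k in 0:9])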
My understanding is that making RNG deterministic and parallel is hard when the task scheduler is dynamic, as in Julia >= 1.3 (hence my last comment). Is there a packaged solution for Julia? Or maybe people actually don't need it?
Or should we simply calculate the length of the random sequence consumed in each thread and apply the jump-ahead afterwards?
Well… that's a sacrifice of performance.
Thanks!
So does this mean the "jumper" is already implemented, or does it use somewhat different techniques to ensure between-thread independence? Concretely, if I
1. draw sequences of the same length from each thread and then join them together, or
2. draw sequences of random lengths from each thread and then join them together,
will the results be a good representation of random sequences? I guess case 1 is easier to solve. Please point out if I am wrong.
If you just want thread-independent random numbers (or sequences), 1.3 should have solved the problem for you (i.e., this is the default behavior). So you can just generate random sequences in different threads and combine them afterwards, if that's what you want.
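For example, a minimal sketch of that pattern using explicit, thread-private RNG objects (the seed choices are arbitrary), which stays reproducible regardless of scheduling:

using Random

chunks = Vector{Vector{Float64}}(undef, Threads.nthreads())
Threads.@threads for i in 1:Threads.nthreads()
    rng = MersenneTwister(2020 + i)   # explicit, thread-private state
    chunks[i] = rand(rng, 100)
end
combined = reduce(vcat, chunks)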
I understand what it implies, but not why it is so useful.
For example, I have seen people rely on this for testing, but I don't think it is the right approach (as it is quite fragile to details of the algorithm).
Because getting this property is quite difficult, before we focus on it as a desirable property of an RNG in a parallel setting, it would be great to understand the rationale.
The demand may be more common than you think. For example, I think most CMS (physics) papers would use seed=123456 for quite a few Monte Carlo/inference tools to make the results "reproducible" (for debugging and peer-review purposes, I can only imagine).
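For illustration, that fixed-seed convention in (serial) Julia:

using Random

Random.seed!(123456)
a = rand(5)
Random.seed!(123456)
b = rand(5)
a == b   # true: the same seed reproduces the same serial stream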
How about debugging input-dependent bugs? If you have a deterministic program, you can reproduce them quite easily and re-run the same execution as many times as you want. If you have parallel code that depends on the exact scheduling of the tasks, I can imagine it is extremely hard (impossible?) to reproduce the same execution reliably.
I do agree that relying on the exact output of RNG-dependent code is not the right approach in testing, because getting the same output from code using an RNG across library versions is hard. But I don't think that is directly relevant to getting the same result from the same program with the same input, if it only does pure computation. Also, there seems to be a way to do it already. Implementing it is probably very hard work, but I don't think it is as difficult as improving RNG-using libraries without changing their exact output.
That means I don't know the exact length of the random sequence drawn within each thread.
For example, thread 1 has a process requiring 100 random numbers and thread 2 requires 120. Each is a good representation of randomness on its own, as a separate event list or stochastic process, but if I combine them, the whole might not be a good stand-in for a truly random sequence.
Is this property algorithm-dependent? Maybe I should read about PCG or some other algorithms?
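For reference, the core of PCG is tiny. Here is a hedged sketch of one PCG-XSH-RR (64-bit state, 32-bit output) step after O'Neill's published reference constants; the odd increment `inc` is what selects the stream:

function pcg32_step(state::UInt64, inc::UInt64)
    # LCG state update; the increment is forced odd
    new_state = state * 0x5851f42d4c957f2d + (inc | 0x1)
    # XSH-RR output permutation of the *old* state hides the LCG's lattice
    xorshifted = UInt32((((state >> 18) ⊻ state) >> 27) & 0xffffffff)
    rot = Int(state >> 59)                 # top 5 bits pick the rotation
    out = (xorshifted >> rot) | (xorshifted << ((32 - rot) & 31))
    return new_state, out
end

state = UInt64(42)
state, r = pcg32_step(state, UInt64(54))   # a different `inc` gives a different stream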
And I think at some point, if the jumpers enable us to write and run RNGs on ordinary graphics cards… well, that's a different story… maybe forget about this :joy: