I looked at LXM when I was implementing the splitmix-like thing. I had actually independently come up with a similar design. At the time I decided against it and just went with the relatively minimal fix, but that was before I was aware of the linear correlation issue with four tasks. In light of that, I’ll have to reevaluate LXM.
Rereading the paper, it doesn’t really do anything clever or helpful for the split operation:
Creating a new instance of an LXM algorithm from an existing one is done in a straightforward way: the nextLong() or nextInt() method of the existing one is used to generate values for the state variables of the LCG and XBG subgenerators and for the additive parameter of the LCG.
In other words, they just sample from the parent RNG to get the initial state of the child RNG. This is the obvious thing to do, and what we used to do, but it means that forking a task changes the parent task’s RNG state which we do not want. So, unfortunately, no insight there.
We would still need a similar design to what we’re already doing, which is to use an internal PRNG (we use PCG-RXS-M-XS-64, which seems like by far the best choice for this) to perturb the state of the child task’s main RNG relative to the parent’s main RNG. We would still need to be mindful of how that perturbation is done and avoid problems like making it easy to create observable correlations in RNG states. The LXM paper doesn’t address that at all and assumes that the RNG output is good enough to directly provide state for the child.
In short, LXM looks good and might make a good replacement main RNG in the future, but I don’t think it helps solve the task split issue at all. The benefits that LXM would potentially give are:
We could reduce the main RNG state from 256 down to 192, taking our total RNG state to 256 bits, including the internal RNG.
The best of both worlds in terms of RNG families:
LCG-based RNGs have well-known issues. The PCG approach does a really nice job of making LCGs useful by making the output function non-trivial.
XBGs like Xorshiro and Xorshift have some very bad behavior when they encounter states with not enough bits set—which are guaranteed to eventually occur since they have full period. LCG-based RNGs don’t have this problem.
Combining the output of an LCG-based RNG with the output of an XBG allows the LCG “carry the weight” while the XBG is going through a rough patch. The output will be as good as the LCG-based RNG, which would still be quite decent.
The XBG will similarly mask statistical issues that the LCG might have.
There might be a clever approach to splitting with this kind of “two engines” RNG design. An obvious trick is to perturb the state of the LCG engine but not the XBG engine, or vice versa.
Since the LCG and XBG have periods that differ by one, changing one but not the other is guaranteed to jump the overall RNG state by a large amount.
This is cute, but I think you’re just making an exact collision more likely in exchange for making landing somewhere near another task in the overall RNG sequence impossible. I’m not sure that’s a good tradeoff. Exact collisions seem worse.
Ok, that makes sense. So s4 is not a part of the RNG, does that not mean you can cache s0..s3 and restore them after initiating the child? Would that advance the splitmix state, but not change the period of the RNG? You’d need to mix in the s4 in seeding the child to avoid that problem, would an XOR work here?
I doubt I’ll come up with something helpful, but I’ve been following the thread with great interest, I hope you’ll indulge my curiosity.