Yeah, the whole RNG was initially designed and implemented only for device-side use, where we have to support any launch configuration. The host-side RNG wrapper around it only materialized after that.
Maybe I’m misunderstanding, but the user shouldn’t have to do anything. By default, unless the counter is set explicitly (as the RNG kernel does), the generated code seeds the device-side RNG with a random seed that is fed from the CPU side and is unique for every kernel invocation.
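To make the idea concrete, here is a rough conceptual sketch in plain Python of how a counter-based generator with a per-launch seed behaves. It is not the CUDA.jl implementation: `mix64`, `counter_rng` and `launch` are hypothetical names, and the SplitMix64-style mixer is only a stand-in for the real generator. The point is that each value depends only on the seed, the thread index and a counter, so any launch configuration works without shared state.

```python
import secrets

MASK64 = (1 << 64) - 1

def mix64(x):
    """SplitMix64-style mixer; a stand-in, not the generator CUDA.jl uses."""
    x = (x + 0x9E3779B97F4A7C15) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)

def counter_rng(seed, thread_id, counter):
    """Stateless draw: the output depends only on (seed, thread_id, counter)."""
    return mix64(mix64(mix64(seed) ^ thread_id) ^ counter)

def launch(seed, nthreads, draws_per_thread=2):
    """Simulate one kernel launch; every 'thread' starts its counter at 0."""
    return [
        [counter_rng(seed, t, c) for c in range(draws_per_thread)]
        for t in range(nthreads)
    ]

if __name__ == "__main__":
    # Assumption: the host draws a fresh seed for each launch and passes it along.
    first = launch(secrets.randbits(32), nthreads=4)
    second = launch(secrets.randbits(32), nthreads=4)
    print(first[0], second[0])
```

With the same seed, two simulated launches produce identical streams; with a fresh seed per launch, they differ, which is why the user does not need to seed anything by hand.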
It was suggested to me in CUDA.jl v3.0 - #6 by oschulz. Being counter-based helped fix the issues we were having with the previous Tausworthe generator. Having an existing Julia implementation in Random123.jl also helped with implementing it.