All the rand-like functions in CUDA.jl return a CuArray. But memory allocation is not allowed inside a kernel function, so a CuArray cannot be created there. Is there any way to get a single random number instead of an array?
Host rand calls don’t work on the GPU. We also don’t support the CURAND device API for now. See Generating Random Number from inside Kernel for possible workarounds.
IIRC @tkf also had a solution, maybe he can elaborate (Slack has eaten the history).
tkf’s implementation is here: Monte-Carlo π · FoldsCUDA
I have suggested that Julia use this:
https://sunoru.github.io/RandomNumbers.jl/stable/man/xorshifts/
Xoroshiro128, Xoroshiro128Star and Xoroshiro128Plus: The successor to Xorshift128 series. They make use of a carefully handcrafted shift/rotate-based linear transformation, resulting in a significant improvement in speed and in statistical quality. Therefore, Xoroshiro128Plus is the current best suggestion for replacing other low-quality generators.
I’ve not looked into issues with GPUs; maybe this works as is.
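To make the idea concrete, here is a minimal sketch of a xoroshiro128+ step written as a pure function on two UInt64 state words, so it returns a single number and allocates nothing. This is not the RandomNumbers.jl code: the names rotl, xoroshiro128plus, to_unit_float32 and mean_of_samples are made up for illustration, and the shift/rotate constants (24, 16, 37) are taken from the public-domain reference implementation by Blackman and Vigna, which may differ from what the package uses. It has only been reasoned about on the CPU. Whether it compiles inside a CUDA.jl kernel is untested here, and each thread would need its own distinct, nonzero state.

```julia
# Rotate a 64-bit word left by k bits (0 < k < 64).
rotl(x::UInt64, k::Int) = (x << k) | (x >> (64 - k))

# One xoroshiro128+ step (hypothetical helper, not a library API).
# The state comes in and goes out as plain UInt64 values, so no array is needed.
# The state must not be all zeros.
@inline function xoroshiro128plus(s0::UInt64, s1::UInt64)
    result = s0 + s1                         # wraps modulo 2^64
    s1 ⊻= s0
    new_s0 = rotl(s0, 24) ⊻ s1 ⊻ (s1 << 16)
    new_s1 = rotl(s1, 37)
    return result, new_s0, new_s1
end

# Map the top 24 bits of a draw to a Float32 in [0, 1).
to_unit_float32(x::UInt64) = Float32(x >> 40) * 2.0f0^(-24)

# Usage: draw n numbers, threading the state through the loop by hand.
function mean_of_samples(n::Int)
    s0, s1 = 0x0123456789abcdef, 0xfedcba9876543210  # arbitrary nonzero seed
    acc = 0.0f0
    for _ in 1:n
        r, s0, s1 = xoroshiro128plus(s0, s1)
        acc += to_unit_float32(r)
    end
    return acc / n
end
```

Passing the state in and out explicitly, rather than mutating an RNG object, is the design choice that avoids allocation; the cost is that the caller has to carry s0 and s1 around and seed them differently per thread.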
FYI here’s a follow-up comment on the FoldsCUDA.jl link: Advice for improving Monte-Carlo code - #25 by tkf