CUDA.randn! allocation

Hello,

I’m using CUDA.randn! on a CuArray, and CUDA.@allocated reports a memory allocation. This doesn’t happen with CUDA.rand!. Is this expected?

julia> using CUDA

julia> rng = CURAND.default_rng();

julia> a = CuArray{Float64}(undef, 10000);

julia> CUDA.@allocated CUDA.rand!(rng, a)
0

julia> CUDA.@allocated CUDA.randn!(rng, a)
131072

Here are some more details about my installation.

julia> CUDA.versioninfo()
CUDA toolkit 11.1.1, artifact installation
CUDA driver 11.1.0
NVIDIA driver 455.32.0

Libraries:
- CUBLAS: 11.3.0
- CURAND: 10.2.2
- CUFFT: 10.3.0
- CUSOLVER: 11.0.1
- CUSPARSE: 11.3.0
- CUPTI: 14.0.0
- NVML: 11.0.0+455.32.0
- CUDNN: 8.10.0 (for CUDA 11.2.0)
- CUTENSOR: 1.2.2 (for CUDA 11.1.0)

Toolchain:
- Julia: 1.6.0
- LLVM: 11.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
- Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

1 device:
  0: GeForce RTX 2080 (sm_75, 7.254 GiB / 7.795 GiB available)

CURAND requires a power-of-2 length array for randn!, so CUDA.jl has to allocate a temporary padded buffer, fill that, and copy the result back into your array.
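As a rough check, the reported 131072 bytes is consistent with a buffer padded to the next power of two. The sketch below assumes the temporary buffer is sized exactly that way; `padded_bytes` is a hypothetical helper for illustration, not part of CUDA.jl, and it runs in plain Julia without a GPU.

```julia
# Hypothetical helper (not a CUDA.jl function): size in bytes of a buffer
# whose length is rounded up to the next power of two.
# Assumption: the temporary buffer holds nextpow(2, len) elements of type T.
function padded_bytes(::Type{T}, len::Integer) where {T}
    padlen = nextpow(2, len)      # smallest power of two >= len
    return padlen * sizeof(T)     # element count times bytes per element
end

# 10000 Float64 values pad to 16384 elements of 8 bytes each.
println(padded_bytes(Float64, 10_000))
```

Under the same assumption, an array whose length is already a power of two (16384, say) would not need any padding, which may be worth trying if the extra allocation matters.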

I see, thanks!