CUDA has random numbers, e.g. CUDA.rand(2), but I can't see how (or if) I can create them inside a @kernel.
I need x,y pairs in a specific range with x^2 + y^2 <= 1.
I used just plain rand() and got
... resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
maleadt
February 15, 2022, 9:22am
rand() is supported in CUDA kernels. You seem to be using Julia 1.7, so make sure you're also using the latest version of GPUCompiler.jl, which should handle the get_pgcstack intrinsic correctly.
Here’s an MWE in a new environment, with a square! function to show CUDA is working.
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, haswell)
Environment:
JULIA_CPU_THREADS = 40
JULIA_NUM_THREADS = 40
$ mkdir gpurand && cd gpurand && julia --project=. -q
julia> using GPUCompiler
...
[61eb1bfa] + GPUCompiler v0.13.11
julia> using CUDA
...
[052768ef] + CUDA v3.8.0
julia> using KernelAbstractions
...
[63c18a36] + KernelAbstractions v0.7.2
julia> using CUDAKernels
...
[72cfdca4] + CUDAKernels v0.3.3
julia> s = CUDA.rand(2)
2-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.049182236
0.29649353
julia> @kernel function square!(S)
    I = @index(Global)
    @inbounds S[I] = S[I]^2
end
square! (generic function with 5 methods)
julia> sq! = square!(CUDADevice(), 8)
julia> wait(sq!(s, ndrange=length(s))) ; s
2-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.0024188925
0.08790842
julia> @kernel function rand!(R)
    I = @index(Global)
    @inbounds R[I] = rand()
end
julia> r = CUDA.zeros(16)
16-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.0
⋮
0.0
julia> r! = rand!(CUDADevice(), 8)
KernelAbstractions.Kernel{CUDADevice, KernelAbstractions.NDIteration.StaticSize{(8,)}, KernelAbstractions.NDIteration.DynamicSize, typeof(gpu_rand!)}(gpu_rand!)
julia> wait(r!(r, ndrange=length(r)))
ERROR: InvalidIRError: compiling kernel gpu_rand!(Cassette.Context{nametype(CUDACtx), Nothing, Nothing, KernelAbstractions.var"##PassType#291", Nothing, Cassette.DisableHooks}, typeof(gpu_rand!), KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(8,)}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, Nothing}}, CuDeviceVector{Float32, 1}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
I suspect this is Cassette.jl messing things up. This should hopefully be fixed by https://github.com/JuliaGPU/KernelAbstractions.jl/pull/288
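For the original goal (x,y pairs with x^2 + y^2 <= 1), here is a minimal sketch of rejection sampling, written as plain CPU Julia so it runs without a GPU. The name sample_unit_disk is hypothetical and not from this thread, and the [-1, 1] square is an assumption, since the "specific range" in the question is not stated. Inside a kernel, the explicit rng argument would presumably be replaced by a device-side rand(Float32) call once the compile error above is resolved. That part is untested here.

```julia
using Random

# Hypothetical helper (not from the thread): draw one (x, y) pair with
# x^2 + y^2 <= 1 by rejection sampling from the square [-1, 1) x [-1, 1).
function sample_unit_disk(rng::AbstractRNG, ::Type{T}=Float32) where {T<:AbstractFloat}
    while true
        # rand(rng, T) is uniform in [0, 1); rescale to [-1, 1)
        x = 2 * rand(rng, T) - 1
        y = 2 * rand(rng, T) - 1
        # Accept only points inside the unit disk (about 78.5% of draws)
        if x^2 + y^2 <= 1
            return (x, y)
        end
    end
end

# CPU usage example with a seeded generator
rng = MersenneTwister(1234)
pts = [sample_unit_disk(rng) for _ in 1:1000]
```

Rejection sampling keeps the accepted points uniform over the disk, at the cost of a data-dependent loop. On a GPU that loop means threads in a warp may run a different number of iterations, so this may not be the fastest option there.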