CUDAnative dynamic allocation

I need to be able to dynamically allocate memory inside a kernel.

val = MArray{Tuple{4}, Float32}(undef)

is a way to get static allocation with StaticArrays.jl inside a CUDA kernel.

I figured something like Array{Float32}(undef, 4) would work in a kernel, but it doesn’t, and I haven’t been able to work out the syntax from anything I’ve seen in CUDA C.

Is this functionality available outside of calls to @cuDynamicSharedMem (which I’ve found requires very careful syncing to avoid mistakes, and risks not having enough memory available)?

Array is the Julia CPU implementation of the AbstractArray interface and will never work on the GPU. You can use StaticArrays, but you need to take care that it remains stack allocated and doesn’t actually allocate on the heap: since we don’t have garbage collection on the GPU, heap memory will never be freed and you’ll quickly run out of dynamic memory. Using shared memory is a good alternative, but it has different semantics with respect to parallelism.

StaticArrays has a dynamic array option? I can’t make sense of what you said. I’m trying to do:

using StaticArrays, CUDAnative

function myKern(n)
    # n is a runtime value here, so the array size is not known to the compiler
    val = MArray{Tuple{n}, Float32}(undef)
    return nothing
end

@cuda threads=1 blocks=1 myKern(2)

where the memory size is not known at compile time.

No, but if you let a StaticArray escape, or pass it around in ways the compiler can’t reason about, it will heap allocate and result in calls to malloc. That kind of dynamic memory is very limited and untracked.

That won’t work like that. At the very least, you will need to specialize on n (i.e. pass it as a Val). But I have limited experience working with StaticArrays on the GPU, maybe @vchuravy knows of another way to, essentially, emit a variable-size alloca here without forcing respecialization.

Yeah, the only way to make that work is with statically known sizes:

using StaticArrays, CUDAnative

function myKern(::Val{n}) where n
    # n is a type parameter, so the array size is known when the kernel is compiled
    val = MArray{Tuple{n}, Float32}(undef)
    return nothing
end

@cuda threads=1 blocks=1 myKern(Val(2))

Should work.
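If n only becomes known at run time on the host, one option is to wrap it in a Val right at the call site. The following is a CPU-only sketch of that pattern, not something tested on a GPU; the function name fill_squares is made up for illustration, and it assumes only that StaticArrays is installed. Each distinct n triggers a fresh specialization, so this is probably only practical when n takes a handful of values.

```julia
using StaticArrays

# The size N is a type parameter, so the compiler knows it when it
# specializes this method and the MVector has a fixed, known size.
function fill_squares(::Val{N}) where {N}
    buf = MVector{N, Float32}(undef)
    for i in 1:N
        buf[i] = Float32(i)^2
    end
    return sum(buf)
end

n = 3                          # a value only known at run time
total = fill_squares(Val(n))   # dynamic dispatch on the host; 1 + 4 + 9 = 14.0f0
```

The same idea should carry over to the kernel launch above by passing Val(n) instead of Val(2), at the cost of compiling one kernel per distinct size.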


While that isn’t dynamic, I realize that I can make my thing work with this. Thank you for sharing the syntax too.