Julia: How to set and use CUDA/Host Global memory

I want to poll a global flag inside of a kernel in CUDA so as to shut down the kernel gracefully via the Host.

  1. How do I set up a global flag on the host? (I have tried abort = CuArray([false]).)
  2. Do I pass ‘abort’ as an argument to the kernel function, and can I check its value there?
  3. While the kernel is running (it takes about 3 seconds) without a sync: after 0.5 seconds I set ‘abort’ to true on the host side, but this has no effect; the kernel carries on running until it completes at 3 seconds.

a = cu([0]; unified=true)

function kernel(b)
    b[1] > 0 && (@cuprintln("ABORTED"); return nothing)
    for _ in 1:200000
        a = sqrt(2)
    end
    return nothing
end
println("START")
fill!(a, 0)
@cuda threads=16 blocks=1 kernel(a)
fill!(a, 1) # has no effect on the kernel
println("DONE")

I’m doing something basically wrong here. Will someone put me right please?


That won’t work like that: fill! uses an API call that is executed in order, and thus waits for the kernel to complete. Either perform that call on a different stream (using CUDA.stream!, or from a different task), or use an allocation that doesn’t require an API call to write to. Typically that’s a device-mapped host allocation, using Mem.alloc(Mem.Host, sizeof(Int), Mem.HOSTALLOC_DEVICEMAP) (then wrapped to an Array using unsafe_wrap). See the exception_flag in CUDA.jl for an example use of this.

Thanks for that.
I’m a newbie at CUDA.jl although I read up on CUDA several years ago.
How would I use CUDA.stream!?
Would it be
CUDA.stream!(CuStream()) do; fill!(a,1);end

Something like that should work, yes. You can also use tasks, as shown in the CUDA.jl 3.0 release post on JuliaGPU.

But if you just want to set a single flag, using a device-mapped host allocation is probably a better choice (it also guarantees that the GPU will read the flag as soon as it’s set on the CPU).
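
For anyone landing here later, a minimal sketch of the device-mapped host allocation approach might look like the following. It assumes a CUDA.jl release that still exposes the Mem submodule used in the reply above (newer releases may spell these names differently), and the names host_flag, dev_flag and out are made up for illustration. It also assumes the compiler re-reads the flag on every loop iteration, which is not strictly guaranteed, so treat it as a starting point rather than a definitive implementation.

```julia
using CUDA

# Page-locked host memory that is also mapped into the device address space.
# Assumption: the `Mem` submodule is available, as in the reply above.
buf = Mem.alloc(Mem.Host, sizeof(Int), Mem.HOSTALLOC_DEVICEMAP)

# Host-side view: writes here are plain CPU stores, not CUDA API calls,
# so they are not queued behind the running kernel.
host_flag = unsafe_wrap(Array, convert(Ptr{Int}, buf), 1)
host_flag[1] = 0

# Device-side view of the same memory, to be passed to the kernel.
dev_flag = unsafe_wrap(CuArray, convert(CuPtr{Int}, buf), 1)

function kernel(flag, out)
    for i in 1:200_000
        # Poll the flag on each iteration and leave early once it is set.
        if flag[1] != 0
            return nothing
        end
        # Some work with a visible side effect, so the loop is not optimized away.
        out[threadIdx().x] += sqrt(Float32(i))
    end
    return nothing
end

out = CUDA.zeros(Float32, 16)
@cuda threads=16 blocks=1 kernel(dev_flag, out)  # asynchronous launch

# Request the abort from the host. Whether the kernel is still running at this
# point depends on how long the work takes on a given GPU.
host_flag[1] = 1

synchronize()
# Call Mem.free(buf) once neither view of the flag is needed any more.
```

The flag is passed to the kernel as an ordinary argument, which also covers question 2 in the original post: the kernel reads it like any other device array.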