Pointer to CuArray as a parameter for DifferentialEquations right-hand-side function

Thank you for the suggestion. It seems KernelAbstractions allows me to do what I want:

import CUDA
using KernelAbstractions

CUDA.allowscalar(false)


@kernel function func(u, p)
  f, = p
  I = @index(Global)
  u[I] = f[I]
end


N = 2048
f = ones(Float32, N)

u_gpu = CUDA.zeros(N)
f_gpu = CUDA.CuArray(f)
p = (f_gpu, )

kernel = func(CUDADevice(), 32)   # instantiate the kernel for CUDA with workgroup size 32

# launch kernel with original f ------------------------------------------------
event = kernel(u_gpu, p; ndrange=size(u_gpu))
wait(event)

u = CUDA.collect(u_gpu)
@show isequal(u, f)   # -> isequal(u, f) = true

# change f outside of kernel and launch the kernel once more -------------------
@. f_gpu = 2 * f_gpu
ev = Event(CUDADevice())   # needed for synchronization
event = kernel(u_gpu, p; ndrange=size(u_gpu), dependencies=ev)
wait(event)

u = CUDA.collect(u_gpu)
@show isequal(u, 2 .* f)   # -> isequal(u, 2 .* f) = true

As far as I understand, somewhere under the hood KernelAbstractions converts f into a CuDeviceArray. I would like to understand how I can do this conversion myself. If someone could help me with that, e.g. point me to the proper line in the sources, I would be very grateful (of course, I will keep trying to find it myself, though I failed after a brief look). A direct conversion, f_cda = CUDA.CuDeviceArray(f_gpu.dims, CUDA.DevicePtr(pointer(f_gpu))), does not work: I obtain a ReadOnlyMemoryError() when I try to access f_cda.
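
For what it's worth, my current guess (which I have not verified against the KernelAbstractions sources) is that the conversion is done with something like CUDA.cudaconvert, which is what the @cuda macro applies to its arguments:

import CUDA

f_gpu = CUDA.CuArray(ones(Float32, 2048))

# my guess: cudaconvert turns the CuArray into the CuDeviceArray that a kernel sees
f_dev = CUDA.cudaconvert(f_gpu)
@show typeof(f_dev)   # a CuDeviceArray type; valid only when used inside a kernel

But I am not sure this is the mechanism KernelAbstractions actually relies on, hence the question.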

I also have a question about the choice of the number of CUDA threads for KernelAbstractions (equal to 32 in the code above). For example, in DiffEqGPU.jl the number of CUDA threads is hard-coded to 256 (https://github.com/SciML/DiffEqGPU.jl/blob/7c4398df1d069dea17c8292ae067329670a6e4fc/src/DiffEqGPU.jl#L40). Can I choose it more wisely, e.g. as in CUDA.jl, where I can use a get_config function (see The most general way to estimate the optimal arguments for @cuda macro)? A sketch of the kind of choice I mean is below.
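
For reference, this is the occupancy-based choice I can make with plain CUDA.jl (a sketch with a trivial hypothetical gpu_copy! kernel of mine); I do not know how to do the equivalent through KernelAbstractions:

import CUDA

function gpu_copy!(u, f)
    I = (CUDA.blockIdx().x - 1) * CUDA.blockDim().x + CUDA.threadIdx().x
    if I <= length(u)
        @inbounds u[I] = f[I]
    end
    return nothing
end

N = 2048
u_gpu = CUDA.zeros(Float32, N)
f_gpu = CUDA.CuArray(ones(Float32, N))

# compile without launching, then query the occupancy API for a good block size
kernel = CUDA.@cuda launch=false gpu_copy!(u_gpu, f_gpu)
config = CUDA.launch_configuration(kernel.fun)
threads = min(N, config.threads)
blocks = cld(N, threads)
kernel(u_gpu, f_gpu; threads=threads, blocks=blocks)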