CUDAnative: would it be possible to specify interface arguments rather than kernel arguments?

samo · February 28, 2020, 10:17am

Issue

Let’s consider the following simple CUDAnative example:

julia> using CuArrays, CUDAnative

julia> A = CuArrays.zeros(2,2)
2×2 CuArray{Float32,2}:
 0.0  0.0
 0.0  0.0

julia> B = CuArrays.ones(2,2)
2×2 CuArray{Float32,2}:
 1.0  1.0
 1.0  1.0

julia> function f(X, Y)
           ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
           iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
           X[ix,iy] = 2*Y[ix,iy]
           return
       end
f (generic function with 1 method)

julia> @cuda threads=(2,2) f(A, B)

julia> A
2×2 CuArray{Float32,2}:
 2.0  2.0
 2.0  2.0

You can observe that arrays of type CuArray{Float32,2} were passed to function f. Now, to make sure that one does not accidentally pass arrays of different types to f (e.g., one CuArray{Float32,2} and one CuArray{Float64,2}, which would lead to a type conversion inside the kernel), one might want to fix the argument types to CuArray{Float32,2}. Naturally, one would add this to the function signature, as in function g:

function g(X::CuArray{Float32,2}, Y::CuArray{Float32,2})
    ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
    iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
    X[ix,iy] = 2*Y[ix,iy]
    return
end

However, this leads to the following error when function g is launched:

julia> @cuda threads=(2,2) g(A, B)
ERROR: MethodError: no method matching g(::Type{CuDeviceArray{Float32,2,CUDAnative.AS.Global}}, ::Type{CuDeviceArray{Float32,2,CUDAnative.AS.Global}})
Stacktrace:
 [1] method_age(::Function, ::Tuple{DataType,DataType}) at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:76
 [2] macro expansion at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:372 [inlined]
 [3] #cufunction#176(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::typeof(g), ::Type{Tuple{CuDeviceArray{Float32,2,CUDAnative.AS.Global},CuDeviceArray{Float32,2,CUDAnative.AS.Global}}}) at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:357
 [4] cufunction(::Function, ::Type) at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:357
 [5] top-level scope at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:174
 [6] top-level scope at gcutils.jl:87
 [7] top-level scope at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:171

In fact, the CUDA kernel requires arguments of type CuDeviceArray rather than CuArray to work properly:

julia> function g(X::CuDeviceArray{Float32,2}, Y::CuDeviceArray{Float32,2})
           ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
           iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
           X[ix,iy] = 3*Y[ix,iy]
           return
       end
g (generic function with 2 methods)

julia> @cuda threads=(2,2) g(A, B)

julia> A
2×2 CuArray{Float32,2}:
 3.0  3.0
 3.0  3.0

Question

Would it be possible in the future for the end user of CUDAnative to specify CuArray for kernel arguments instead of CuDeviceArray, i.e., to specify the arguments of the interface to their code rather than the arguments of the actual kernel that runs on the device?

Thanks!!

maleadt · February 28, 2020, 9:25pm

No, because kernels are regular functions that abide by Julia’s rules. There’s nothing magical happening here. Your best bet for making this possible would be for CuArray to be usable on the device directly, i.e., removing the need for CuDeviceArray altogether, but I don’t see that happening anytime soon (we’d need really powerful contextual dispatch for that).
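To illustrate the "regular functions" point without a GPU, here is a minimal sketch in plain Julia. HostArray, DeviceArray, to_device and launch are hypothetical stand-ins for CuArray, CuDeviceArray and the conversion that happens at launch; they are not part of CuArrays or CUDAnative. The sketch only shows that dispatch sees the converted argument types, so a method typed on the host type is never matched, while a method typed on AbstractArray is.

```julia
# Hypothetical stand-in types (NOT the real CuArray / CuDeviceArray).
struct HostArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end
struct DeviceArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end

# Minimal array interface for both wrappers.
for W in (:HostArray, :DeviceArray)
    @eval begin
        Base.size(a::$W) = size(a.data)
        Base.getindex(a::$W, i::Int...) = a.data[i...]
        Base.setindex!(a::$W, v, i::Int...) = (a.data[i...] = v)
    end
end

# Hypothetical launch-time conversion; the wrapper shares the same storage.
to_device(a::HostArray) = DeviceArray(a.data)

# Hypothetical launcher: convert the arguments, then call the function normally.
launch(kernel, args...) = kernel(map(to_device, args)...)

# Loosely typed "kernel": any 2-D arrays, but both must share element type T.
function loose(X::AbstractArray{T,2}, Y::AbstractArray{T,2}) where {T}
    for j in axes(X, 2), i in axes(X, 1)
        X[i, j] = 2 * Y[i, j]
    end
    return
end

# Tightly typed on the host type: never matches after conversion.
strict(X::HostArray{Float32,2}, Y::HostArray{Float32,2}) = loose(X, Y)

A = HostArray(zeros(Float32, 2, 2))
B = HostArray(ones(Float32, 2, 2))

launch(loose, A, B)   # dispatches on DeviceArray{Float32,2}

# The host-typed method is rejected by ordinary dispatch.
strict_failed = try
    launch(strict, A, B)
    false
catch err
    err isa MethodError
end

# Mixed element types are rejected by the shared type parameter T.
mixed_failed = try
    launch(loose, A, HostArray(ones(Float64, 2, 2)))
    false
catch err
    err isa MethodError
end
```

The `where {T}` signature gives the same protection against mixing Float32 and Float64 arrays that the original post was after, without naming either the host or the device array type.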

Alternatively, you could write a macro to prefix kernel definitions with, which rewrites type signatures similarly to how values are converted at the CPU-GPU boundary (CuArray -> CuDeviceArray, for example), but I don’t think there’s much interest in that. Functions don’t typically need to be tightly typed like that.
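A rough sketch of what such a macro could look like, assuming a plain name substitution in the signature is enough. The names HOST_TO_DEVICE, swap_types, rewrite_signature and @devicesig are made up for illustration and do not exist in CUDAnative. The rewriting works purely on expressions, so the example below runs without a GPU; actually defining a kernel with the macro would of course require CUDAnative to be loaded.

```julia
# Hypothetical mapping from host type names to device type names.
const HOST_TO_DEVICE = Dict(:CuArray => :CuDeviceArray)

# Recursively replace type names inside an expression.
swap_types(x) = x                                  # literals and other leaves
swap_types(s::Symbol) = get(HOST_TO_DEVICE, s, s)
swap_types(ex::Expr) = Expr(ex.head, map(swap_types, ex.args)...)

# Rewrite only the signature of a function definition; the body is untouched.
function rewrite_signature(def::Expr)
    def.head in (:function, :(=)) || error("expected a function definition")
    sig, body = def.args
    return Expr(def.head, swap_types(sig), body)
end

# Hypothetical macro to prefix kernel definitions with.
macro devicesig(def)
    return esc(rewrite_signature(def))
end

# Inspect the rewritten definition without evaluating it.
ex = rewrite_signature(:(function g(X::CuArray{Float32,2}, Y::CuArray{Float32,2})
    return
end))
sig = ex.args[1]

# Intended usage (needs CUDAnative for CuDeviceArray to be defined):
#   @devicesig function g(X::CuArray{Float32,2}, Y::CuArray{Float32,2})
#       ...
#   end
```

Note that the error message above shows a third type parameter on CuDeviceArray (the address space), so the rewritten CuDeviceArray{Float32,2} is a partially specified type, just like in the hand-written g that works.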

samo · March 2, 2020, 9:24am

OK, thanks for the reply!

