Issue
Let’s consider the following simple CUDAnative example:
julia> using CuArrays, CUDAnative
julia> A = CuArrays.zeros(2,2)
2×2 CuArray{Float32,2}:
0.0 0.0
0.0 0.0
julia> B = CuArrays.ones(2,2)
2×2 CuArray{Float32,2}:
1.0 1.0
1.0 1.0
julia> function f(X, Y)
           ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
           iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
           X[ix,iy] = 2*Y[ix,iy]
           return
       end
f (generic function with 1 method)
julia> @cuda threads=(2,2) f(A, B)
julia> A
2×2 CuArray{Float32,2}:
2.0 2.0
2.0 2.0
You can observe that arrays of type CuArray{Float32,2}
were passed to the function f
. Now, to make sure that one does not accidentally pass arrays of different types, e.g., one CuArray{Float32,2}
and one CuArray{Float64,2}
, to f
(which would lead to a type conversion inside the kernel), one might want to restrict the argument type to CuArray{Float32,2}
. So, naturally, one would add this constraint to the function signature, as in the function g
:
function g(X::CuArray{Float32,2}, Y::CuArray{Float32,2})
    ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
    iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
    X[ix,iy] = 2*Y[ix,iy]
    return
end
However, this leads to the following error when the function g
is called:
julia> @cuda threads=(2,2) g(A, B)
ERROR: MethodError: no method matching g(::Type{CuDeviceArray{Float32,2,CUDAnative.AS.Global}}, ::Type{CuDeviceArray{Float32,2,CUDAnative.AS.Global}})
Stacktrace:
[1] method_age(::Function, ::Tuple{DataType,DataType}) at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:76
[2] macro expansion at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:372 [inlined]
[3] #cufunction#176(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::typeof(g), ::Type{Tuple{CuDeviceArray{Float32,2,CUDAnative.AS.Global},CuDeviceArray{Float32,2,CUDAnative.AS.Global}}}) at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:357
[4] cufunction(::Function, ::Type) at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:357
[5] top-level scope at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:174
[6] top-level scope at gcutils.jl:87
[7] top-level scope at /apps/dom/UES/jenkins/7.0.UP01/gpu/easybuild/software/Julia/1.2.0-CrayGNU-19.10-cuda-10.1/extensions/packages/CUDAnative/Lr0yj/src/execution.jl:171
In fact, the CUDA kernel requires arguments of type CuDeviceArray
rather than of type CuArray
to work properly, because @cuda converts each CuArray argument to a CuDeviceArray before the kernel runs on the device:
julia> function g(X::CuDeviceArray{Float32,2}, Y::CuDeviceArray{Float32,2})
           ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
           iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
           X[ix,iy] = 3*Y[ix,iy]
           return
       end
g (generic function with 2 methods)
julia> @cuda threads=(2,2) g(A, B)
julia> A
2×2 CuArray{Float32,2}:
3.0 3.0
3.0 3.0
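In the meantime, one way to still enforce that both arguments share the same element type, without hard-coding Float32, is to parameterize the device-side signature. The function name h below is my own (hypothetical) choice; this is a sketch, not an official CUDAnative API:

```julia
# Sketch: constrain both arguments to one shared element type T.
# CuDeviceArray is the type the kernel actually receives from @cuda,
# so the constraint must be written in terms of it.
function h(X::CuDeviceArray{T,2}, Y::CuDeviceArray{T,2}) where T
    ix = (blockIdx().x-1) * blockDim().x + threadIdx().x
    iy = (blockIdx().y-1) * blockDim().y + threadIdx().y
    X[ix,iy] = 2*Y[ix,iy]
    return
end
```

With this signature, @cuda threads=(2,2) h(A, B) dispatches when A and B have matching element types, while a Float32/Float64 mix raises a MethodError instead of silently converting inside the kernel.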
Question
Would it be possible that, in the future, the end user of CUDAnative could specify CuArray
for the arguments instead of CuDeviceArray
, i.e., that the user would specify the argument types of the interface to their code rather than those of the actual kernel that is run on the device?
Thanks!!