Background:
To explain my problem, I have reduced my actual task to a simple example: take every element of array x, build a third-order (3×3) matrix from it, compute the determinant of that matrix, and assign the determinant to the corresponding element of array y.
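To make the goal concrete, here is a plain CPU sketch of the computation (det3 is just a throwaway name for this illustration); this version runs fine, the problem only appears when I try to do the same thing inside a GPU kernel:

using LinearAlgebra

# CPU-only sketch of the task: for each scalar xi, build a 3x3 matrix and take its determinant
det3(xi) = det([xi+1 xi+2 xi+5
                xi+1 xi+0 xi+4
                xi+2 xi+3 xi+2])

x = rand(10000)
y = det3.(x)    # y[i] holds the determinant computed from x[i]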
My code:
using CuArrays
using CUDAnative
using LinearAlgebra
# Build a 3x3 matrix from the scalar x and return its determinant
function test(x)
    mat = [x+1 x+2 x+5
           x+1 x+0 x+4
           x+2 x+3 x+2]
    c = det(mat)
    return c
end
# Grid-stride loop: each thread handles elements index, index+stride, index+2*stride, ...
function kernel!(x, y)
    index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    stride = blockDim().x * gridDim().x
    for i = index:stride:size(x, 1)
        y[i] = test(x[i])
    end
    return nothing
end
x = rand(10000)
y = zeros(10000)
d_x = cu(x)    # copy the input to the GPU (cu converts to Float32)
d_y = cu(y)    # output array on the GPU
numblocks = ceil(Int, size(x, 1) / 256)
@cuda threads=256 blocks=numblocks kernel!(d_x, d_y)
Result (the kernel fails to compile):
GPU compilation of kernel!(CuDeviceArray{Float32,1,CUDAnative.AS.Global}, CuDeviceArray{Float32,1,CUDAnative.AS.Global}) failed
KernelError: recursion is currently not supported
Try inspecting the generated code with any of the @device_code_... macros.
Stacktrace:
[1] mapreduce_impl at reduce.jl:148 (repeats 2 times)
[2] det at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\LinearAlgebra\src\triangular.jl:2525
[3] det at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\LinearAlgebra\src\generic.jl:1421
[4] kernel! at In[3]:2
Stacktrace:
[1] (::CUDAnative.var"#hook_emit_function#100"{CUDAnative.CompilerJob,Array{Core.MethodInstance,1}})(::Core.MethodInstance, ::Core.CodeInfo, ::UInt64) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\compiler\irgen.jl:102
[2] compile_method_instance(::CUDAnative.CompilerJob, ::Core.MethodInstance, ::UInt64) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\compiler\irgen.jl:149
[3] macro expansion at C:\Users\zenan\.julia\packages\TimerOutputs\7Id5J\src\TimerOutput.jl:228 [inlined]
[4] irgen(::CUDAnative.CompilerJob, ::Core.MethodInstance, ::UInt64) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\compiler\irgen.jl:163
[5] macro expansion at C:\Users\zenan\.julia\packages\TimerOutputs\7Id5J\src\TimerOutput.jl:228 [inlined]
[6] macro expansion at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\compiler\driver.jl:99 [inlined]
[7] macro expansion at C:\Users\zenan\.julia\packages\TimerOutputs\7Id5J\src\TimerOutput.jl:228 [inlined]
[8] #codegen#156(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.codegen), ::Symbol, ::CUDAnative.CompilerJob) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\compiler\driver.jl:98
[9] #codegen at .\none:0 [inlined]
[10] #compile#155(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.compile), ::Symbol, ::CUDAnative.CompilerJob) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\compiler\driver.jl:47
[11] #compile#154 at .\none:0 [inlined]
[12] #compile at .\none:0 [inlined] (repeats 2 times)
[13] macro expansion at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\execution.jl:392 [inlined]
[14] #cufunction#200(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::typeof(kernel!), ::Type{Tuple{CuDeviceArray{Float32,1,CUDAnative.AS.Global},CuDeviceArray{Float32,1,CUDAnative.AS.Global}}}) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\execution.jl:359
[15] cufunction(::Function, ::Type) at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\execution.jl:359
[16] top-level scope at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\execution.jl:176
[17] top-level scope at gcutils.jl:91
[18] top-level scope at C:\Users\zenan\.julia\packages\CUDAnative\Phjco\src\execution.jl:173
[19] top-level scope at In[5]:2
Question:
This example is very close to the task I actually need to complete. Since I am new to GPU programming, I don't understand what this error message is telling me. How can this problem be solved?