Kernel Error: kernel returns a value of type `Union{}`

I am trying to port my Julia code to run with GPU support. I use the LightGraphs package. My code and error message are as follows:


mutable struct Para
  g::SimpleDiGraph{Int64} ;
  o::CuArray{Int64,1} ;
end

function tr(cmin,cmax,N,P,od,Pr,sigma,Pi)
  p = Para(SimpleDiGraph(Int64),CuArrays.fill(0,p.N)) ;
  while (minimum(p.o)==0)
    p.g = watts_strogatz(N,od,Pr,is_directed=true) ;
    p.o = indegree(p.g) ;
  end
end

Running the function:

@device_code_warntype @cuda threads = 100 tr(cmin,cmax,N,P,od,Pr,sigma,Pi)


  #self#::Core.Compiler.Const(tr, false)

1 ─     Main.SimpleDiGraph(Main.Int64)
β”‚       CuArrays.fill
β”‚       Base.getproperty(p, :N)
β”‚       Core.Compiler.Const(:((%2)(0, %3)), false)
β”‚       Core.Compiler.Const(:(p = Main.Para(%1, %4)), false)
β”‚       Core.Compiler.Const(:(Base.getproperty(p, :o)), false)
β”‚       Core.Compiler.Const(:(Main.minimum(%6)), false)
β”‚       Core.Compiler.Const(:(%7 == 0), false)
β”‚       Core.Compiler.Const(:(%8), false)
β”‚       Core.Compiler.Const((:is_directed,), false)
β”‚       Core.Compiler.Const(:(Core.apply_type(Core.NamedTuple, %10)), false)
β”‚       Core.Compiler.Const(:(Core.tuple(true)), false)
β”‚       Core.Compiler.Const(:((%11)(%12)), false)
β”‚       Core.Compiler.Const(:(Core.kwfunc(Main.watts_strogatz)), false)
β”‚       Core.Compiler.Const(:((%14)(%13, Main.watts_strogatz, N, od, Pr)), false)
β”‚       Core.Compiler.Const(:(Base.setproperty!(p, :g, %15)), false)
β”‚       Core.Compiler.Const(:(Base.getproperty(p, :g)), false)
β”‚       Core.Compiler.Const(:(Main.indegree(%17)), false)
β”‚       Core.Compiler.Const(:(Base.setproperty!(p, :o, %18)), false)
β”‚       Core.Compiler.Const(:(goto %6), false)
└──     Core.Compiler.Const(:(return Main.nothing), false)
GPU compilation of tr(Float64, Float64, Int64, Int64, Int64, Float64, Int64, Float64) failed
KernelError: kernel returns a value of type `Union{}`

Make sure your kernel function ends in `return`, `return nothing` or `nothing`.
If the returned value is of type `Union{}`, your Julia code probably throws an exception.
Inspect the code with `@device_code_warntype` for more details.

 [1] check_method(::CUDAnative.CompilerJob) at /root/.julia/packages/CUDAnative/Lr0yj/src/compiler/validation.jl:16
 [2] #codegen#136(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.codegen), ::Symbol, ::CUDAnative.CompilerJob) at /root/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216
 [3] #codegen at /root/.julia/packages/CUDAnative/Lr0yj/src/compiler/driver.jl:0 [inlined]
 [4] #compile#135(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.compile), ::Symbol, ::CUDAnative.CompilerJob) at /root/.julia/packages/CUDAnative/Lr0yj/src/compiler/driver.jl:47
 [5] #compile#134 at ./none:0 [inlined]
 [6] #compile at ./none:0 [inlined] (repeats 2 times)
 [7] macro expansion at /root/.julia/packages/CUDAnative/Lr0yj/src/execution.jl:389 [inlined]
 [8] #cufunction#176(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::typeof(tr), ::Type{Tuple{Float64,Float64,Int64,Int64,Int64,Float64,Int64,Float64}}) at /root/.julia/packages/CUDAnative/Lr0yj/src/execution.jl:357
 [9] cufunction(::Function, ::Type) at /root/.julia/packages/CUDAnative/Lr0yj/src/execution.jl:357
 [10] top-level scope at /root/.julia/packages/CUDAnative/Lr0yj/src/execution.jl:174
 [11] top-level scope at gcutils.jl:87
 [12] top-level scope at /root/.julia/packages/CUDAnative/Lr0yj/src/execution.jl:171
 [13] top-level scope at /root/.julia/packages/CUDAnative/Lr0yj/src/reflection.jl:164
 [14] top-level scope at In[39]:1

If you look at that IR, it does not compute anything meaningful but bails out very early to throw an error (which breaks GPU compilation). The last β€œcall” you can see is to CuArrays.fill, one of whose arguments is the undefined p.

That said, you can’t just slap @cuda in front of a complex function call and expect it to work. The functions you are calling, like CuArrays.fill or the SimpleDiGraph constructor, will simply not work inside a GPU kernel. Instead, you need simple parallel kernels. I recommend you try CuArrays.jl instead, where high-level array operations are implemented using GPU kernels wherever possible.
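As a rough sketch of that split (assuming the CuArrays.jl/CUDAnative.jl stack visible in the stack trace, and the same watts_strogatz parameters as hypothetical placeholders): build the graph on the CPU with LightGraphs, move only the per-vertex data to the GPU, and let CuArrays dispatch the high-level operations to GPU kernels for you — no @cuda needed.

```julia
using CuArrays, LightGraphs

# Host side: graph construction is CPU-only code.
g = watts_strogatz(100, 4, 0.3, is_directed=true)

# Copy the per-vertex degree vector to GPU memory.
o = CuArray(indegree(g))

# High-level array operations on a CuArray run as GPU kernels:
doubled = o .* 2     # broadcast executes on the GPU
total   = sum(o)     # reduction executes on the GPU
```

The design point is that the graph algorithm stays on the CPU and only the array-style numeric work crosses over to the device.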

Thank you for the reply. I am new to GPU support and have been trying to learn from the examples in the packages' documentation. Is there a tutorial you’d recommend that provides some insight into how variables are to be declared and passed, and the dos and don’ts of GPU programming?

Documentation is lacking, yeah, but don’t hesitate to ask questions here or on Slack.
The one tutorial out there does provide some guidance, though. Do know that for the most part, you can use CuArrays.jl without the need to write kernels, and thus without needing to call @cuda (which is only used to launch kernels). If you do need a custom kernel (few people do), you’ll need to start simple because not all of Julia is supported on the GPU. Don’t just put @cuda in front of an existing computation, but read about the GPU programming model and start with simple computations.
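For what “start simple” can look like, here is a minimal custom-kernel sketch, again assuming the CUDAnative.jl stack from the stack trace (the kernel name increment! is a hypothetical placeholder). Each thread handles one array element, and the kernel ends in `return nothing`, which is exactly what the KernelError above asks for:

```julia
using CUDAnative, CuArrays

# Minimal kernel: each GPU thread increments one element of `a`.
function increment!(a)
    i = threadIdx().x          # this thread's 1-based index
    if i <= length(a)          # guard against extra threads
        @inbounds a[i] += 1
    end
    return nothing             # kernels must not return a value
end

a = CuArrays.fill(0, 100)      # allocate on the GPU, outside the kernel
@cuda threads = 100 increment!(a)
```

Note that allocation (CuArrays.fill) happens on the host before launch; only the simple elementwise body runs on the device.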
