For Tullio to work on the GPU you also need to import:
```julia
using CUDA, CUDAKernels, KernelAbstractions
```
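With those loaded, here is a minimal sketch (hypothetical names and shapes) of a `@tullio` call that does compile for the GPU; the key point is that the body only does plain indexing and scalar arithmetic on `CuArray`s of numbers:

```julia
using CUDA, CUDAKernels, KernelAbstractions, Tullio

# Hypothetical shapes, just to show a call the GPU path can handle.
A = cu(rand(Float32, 30, 500))
B = cu(rand(Float32, 500, 200))

# Only plain indexing and scalar arithmetic in the body, so Tullio can
# generate a KernelAbstractions kernel and launch it on the GPU.
@tullio C[i, j] := A[i, k] * B[k, j]
```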
And from what I can see, there is a problem with your call to `linear_interpolation`:
```julia
julia> @tullio o[i,j] := linear_interpolation( gx, gy[i] )(gv[j])
Error: Reason: unsupported dynamic function invocation (call to linear_interpolation)
```
For more details you can read CUDA Troubleshooting.
If we dig deeper, we can see it is a call to a conversion function that fails:
```julia
Reason: unsupported dynamic function invocation (call to convert)
```
And if we inspect `gy`, we find out that you didn't really move `y` to a `CuArray`:
```julia
julia> typeof(gy[1])
SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}
```
(See how it is a `Matrix{Float64}` and not a `CuArray{Float64}`; that's because broadcasting works only one level deep.)
This is the hardest part of GPU programming, as GPUs don't really like lists or Vectors of Vectors. If you try to build one, CUDA.jl will throw:

```julia
ERROR: CuArray only supports element types that are stored inline
```
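For example, something like this hypothetical nested array triggers it:

```julia
julia> using CUDA

julia> CuArray([rand(3) for _ in 1:5])   # a Vector of Vectors cannot be stored on the GPU
ERROR: CuArray only supports element types that are stored inline
```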
The “easiest” way to do this would be to write a custom kernel, or to use an einsum over both dimensions, storing `gy` as a single `CuArray` of size (30, 500).
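I can't write the Tullio version of that for you, but as a rough sketch of the layout change, here is the same idea with plain broadcasting and GPU gathers instead, assuming (hypothetically) a uniform grid `x`, 30 curves of 500 samples each, 200 query points, and the linear interpolation done by hand rather than through Interpolations.jl:

```julia
using CUDA

# Hypothetical data mirroring the question: 30 curves sampled on a common
# uniform 500-point grid, and 200 query points to interpolate at.
x = range(0f0, 1f0; length = 500)       # the shared grid (stays on the CPU)
Y = cu(rand(Float32, 30, 500))          # all curves in one dense GPU matrix
v = cu(rand(Float32, 200))              # query points in [0, 1]

# Because the grid is uniform, the bracketing index is pure arithmetic,
# which is exactly what the GPU likes.
t = clamp.((v .- first(x)) ./ step(x), 0f0, Float32(length(x) - 1))
k = floor.(Int32, t)                    # left grid index, 0-based
w = t .- k                              # weight of the right neighbour

# Gather the two bracketing samples for every (curve, query) pair and blend.
# O has size (30, 200): one interpolated value per curve and query point.
O = Y[:, k .+ 1] .* (1 .- w') .+ Y[:, min.(k .+ 2, Int32(length(x)))] .* w'
```

The point is only that everything the GPU touches is a dense array of plain numbers; whether you then express the same arithmetic with `@tullio` or a hand-written kernel is a separate question.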
Sorry, but here my knowledge unfortunately ends. Also, I don't know about your experience, but keep in mind that Julia tries to compile a GPU kernel from your code, in contrast to Python frameworks like PyTorch or CuPy, where the kernels are already compiled and you only use an API to access them. So in CUDA.jl you generally need to work on arrays of plain numbers only.