Using Interpolations.jl on CuVector

I am feeding a CuVector to an interpolator created with Interpolations.jl and want to get the output as a CuVector, but I am not sure how to achieve this.

Here is what I did. I create an interpolator using sampling points:

using Interpolations
xsample = 0:0.1:1
ysample = rand(length(xsample))
itp = CubicSplineInterpolation(xsample, ysample)

Then I create a CuVector that contains the x-values where I would like to evaluate the interpolator, and perform the interpolation:

using CUDA
cx = cu(rand(100))
y = itp(cx)

The type of y is Vector, not CuVector. I tried to use CuVector as ysample when creating itp, but the result was the same.

I also tried to preallocate y as a CuVector and use the dot syntax for element-wise assignment:

cy = similar(cx)
cy .= itp.(cx)

but this generates an error:

ERROR: GPU compilation of kernel broadcast_kernel(CUDA.CuKernelContext, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, Interpolations.Extrapolation{Float32, 1, ScaledInterpolation{Float32, 1, Interpolations.BSplineInterpolation{Float32, 1, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, BSpline{Cubic{Line{OnGrid}}}, Tuple{Base.OneTo{Int64}}}, BSpline{Cubic{Line{OnGrid}}}, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}, BSpline{Cubic{Line{OnGrid}}}, Throw{Nothing}}, Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64) failed
KernelError: passing and using non-bitstype argument

I would appreciate any help!

As far as I know, Interpolations.jl is not compatible with CUDA.jl at present.
See "GPU support?" (Issue #357 in JuliaMath/Interpolations.jl on GitHub).
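Until native support is available, one fallback is to round-trip through the CPU. The sketch below first checks why the broadcast fails (kernel arguments must be bits types, and the interpolant wraps an ordinary CPU array), then evaluates on the host and uploads the result. The helper name interpolate_via_cpu is made up for illustration and is not part of either package. The two copies cost transfer time, so this probably only makes sense when the interpolation is not the bottleneck.

```julia
using Interpolations, CUDA

xsample = 0:0.1:1
ysample = rand(length(xsample))
itp = CubicSplineInterpolation(xsample, ysample)

# Kernel arguments must be bits types. The interpolant holds a
# heap-allocated coefficient array, so its type is not one.
isbitstype(typeof(itp))  # false

# Hypothetical helper (an assumption, not a library function):
# evaluate on the CPU, then copy the result to the GPU.
function interpolate_via_cpu(itp, cx::CuArray)
    x_host = Array(cx)      # device -> host copy
    y_host = itp.(x_host)   # ordinary CPU broadcast
    return CuArray(y_host)  # host -> device copy
end

cx = cu(rand(100))
cy = interpolate_via_cpu(itp, cx)  # a CuVector with 100 elements
```

The result lives on the GPU again, so later steps can keep using cy in GPU broadcasts.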

@N5N3, thanks for pointing out the Issue.

Does anyone know of alternative packages that support GPU interpolation? I checked a few of the packages listed on the Interpolations.jl documentation home page, but none of them explicitly indicates GPU support.

With ExBroadcast, you can do BSpline interpolations on NVIDIA GPUs via the broadcast interface.

Example code:

module CUDAInterp
using ExBroadcast, Interpolations, CUDA, Adapt
a = randn(ComplexF32, 401, 401)
# Only BSpline is tested (Lanczos should also work).
itp = CubicSplineInterpolation((-100:0.5f0:100, -100:0.5f0:100), a)
# Transfer the interpolant's data into GPU memory.
cuitp = adapt(CuArray, itp)
cuitp.(-100:0.01f0:100, (-100:0.01f0:100)') # 20001×20001 CuArray{ComplexF32, 2}
cuitp.(CUDA.randn(100), CUDA.randn(100)) # 100-element CuArray{ComplexF32, 1}
# cuitp.(randn(100), randn(100)) # error
end