Hi everyone,
I would like to evaluate the associated Legendre polynomials P_l^m (\cos{\theta}) on the GPU. I tried the packages AssociatedLegendrePolynomials.jl and LegendrePolynomials.jl, both of which work as intended for normal arrays but fail for CuArrays. The errors are different in each case and difficult for me to understand. I am not sure if GPU compatibility is not provided or if I am doing something wrong myself. Is there a way to make this work using these packages or some different approach?
Thanks for the help. Below is a MWE.
using AssociatedLegendrePolynomials
using LegendrePolynomials
using Lux
using LuxCUDA
θ = range(0, π, 100)
θ_gpu = range(0, π, 100) |> gpu_device() .|> Float64
Plm1 = AssociatedLegendrePolynomials.Plm(1, 0, cos.(θ)) # works
Plm2 = LegendrePolynomials.Plm.(cos.(θ), 1, 0) # works and gives same result as above
Plm1_gpu = AssociatedLegendrePolynomials.Plm(1, 0, cos.(θ_gpu)) #throws error: Scalar indexing is disallowed
Plm2_gpu = LegendrePolynomials.Plm.(cos.(θ_gpu), 1, 0) # throws error: InvalidIRError: compiling MethodInstance
Here are the exact errors
julia> Plm1_gpu = AssociatedLegendrePolynomials.Plm(1, 0, cos.(θ_gpu))
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.
If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
julia> Plm2_gpu = LegendrePolynomials.Plm.(cos.(θ_gpu), 1, 0)
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#34#36")(::CUDA.CuKernelContext, ::CuDeviceVector{…}, ::Base.Broadcast.Broadcasted{…}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to mpfr_exp)
Stacktrace: