I wonder whether this is a known issue. A quick Google search did not turn up anything.
using CuArrays, Flux
f(x) = sum(mapslices(t->sqrt(sum(t.*t)), x, dims=2))
df(x) = Tracker.gradient(f, x)
x = rand(10, 10) |> gpu
df(x)
ERROR: InvalidIRError: compiling setindex_kernel!(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Tracker.TrackedReal{Float32},2,CUDAnative.AS.Global}, CUDAnative.CuDeviceArray{Tracker.TrackedReal{Float32},1,CUDAnative.AS.Global}, Tuple{Int64,Int64}, Tuple{Int64,Base.OneTo{Int64}}, Int64) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to unsafe_load(p::CUDAnative.DevicePtr{T,A}, i::Integer, ::Val{align}) where {T, A, align} in CUDAnative at /data/.julia/packages/CUDAnative/ytV2j/src/device/pointer.jl:132)
Stacktrace:
[1] getindex at /data/.julia/packages/CUDAnative/ytV2j/src/device/array.jl:78
[2] bgetindex at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:91
[3] macro expansion at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:101
[4] setindex_kernel! at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:95
Reason: unsupported call to the Julia runtime (call to jl_type_error)
Stacktrace:
[1] getindex at /data/.julia/packages/CUDAnative/ytV2j/src/device/array.jl:78
[2] bgetindex at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:91
[3] macro expansion at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:101
[4] setindex_kernel! at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:95
Reason: unsupported dynamic function invocation (call to unsafe_store!(p::CUDAnative.DevicePtr{T,A}, x, i::Integer, ::Val{align}) where {T, A, align} in CUDAnative at /data/.julia/packages/CUDAnative/ytV2j/src/device/pointer.jl:167)
Stacktrace:
[1] setindex! at /data/.julia/packages/CUDAnative/ytV2j/src/device/array.jl:84
[2] _setindex! at abstractarray.jl:1043
[3] setindex! at abstractarray.jl:1020
[4] macro expansion at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:101
[5] setindex_kernel! at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:95
Stacktrace:
[1] check_ir(::CUDAnative.CompilerJob, ::LLVM.Module) at /data/.julia/packages/CUDAnative/ytV2j/src/compiler/validation.jl:114
[2] macro expansion at /data/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
[3] #codegen#119(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Function, ::Symbol, ::CUDAnative.CompilerJob) at /data/.julia/packages/CUDAnative/ytV2j/src/compiler/driver.jl:186
[4] #codegen at /data/.julia/packages/CUDAnative/ytV2j/src/compiler/driver.jl:0 [inlined]
[5] #compile#118(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Function, ::Symbol, ::CUDAnative.CompilerJob) at /data/.julia/packages/CUDAnative/ytV2j/src/compiler/driver.jl:47
[6] #compile#117 at ./none:0 [inlined]
[7] #compile at ./none:0 [inlined] (repeats 2 times)
[8] macro expansion at /data/.julia/packages/CUDAnative/ytV2j/src/execution.jl:380 [inlined]
[9] #cufunction#159(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::typeof(GPUArrays.setindex_kernel!), ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Tracker.TrackedReal{Float32},2,CUDAnative.AS.Global},CUDAnative.CuDeviceArray{Tracker.TrackedReal{Float32},1,CUDAnative.AS.Global},Tuple{Int64,Int64},Tuple{Int64,Base.OneTo{Int64}},Int64}}) at /data/.julia/packages/CUDAnative/ytV2j/src/execution.jl:348
[10] cufunction(::Function, ::Type) at /data/.julia/packages/CUDAnative/ytV2j/src/execution.jl:348
[11] macro expansion at /data/.julia/packages/CUDAnative/ytV2j/src/execution.jl:174 [inlined]
[12] macro expansion at ./gcutils.jl:87 [inlined]
[13] macro expansion at /data/.julia/packages/CUDAnative/ytV2j/src/execution.jl:171 [inlined]
[14] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Tracker.TrackedReal{Float32},2}, ::Tuple{CuArray{Tracker.TrackedReal{Float32},2},CuArray{Tracker.TrackedReal{Float32},1},Tuple{Int64,Int64},Tuple{Int64,Base.OneTo{Int64}},Int64}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /data/.julia/packages/CuArrays/PwSdF/src/gpuarray_interface.jl:59
[15] gpu_call at /data/.julia/packages/GPUArrays/pJw1Y/src/abstract_gpu_interface.jl:151 [inlined]
[16] _unsafe_setindex! at /data/.julia/packages/GPUArrays/pJw1Y/src/indexing.jl:122 [inlined]
[17] _setindex! at ./multidimensional.jl:684 [inlined]
[18] setindex!(::CuArray{Tracker.TrackedReal{Float32},2}, ::CuArray{Tracker.TrackedReal{Float32},1}, ::Int64, ::Base.OneTo{Int64}) at ./abstractarray.jl:1020
[19] concatenate_setindex!(::CuArray{Tracker.TrackedReal{Float32},2}, ::CuArray{Tracker.TrackedReal{Float32},1}, ::Int64, ::Vararg{Any,N} where N) at ./abstractarray.jl:2006
[20] #mapslices#109(::Int64, ::Function, ::getfield(Main, Symbol("##15#16")), ::TrackedArray{…,CuArray{Float32,2}}) at ./abstractarray.jl:1972
[21] #mapslices at ./none:0 [inlined]
[22] f(::TrackedArray{…,CuArray{Float32,2}}) at ./none:1
[23] gradient_(::Function, ::CuArray{Float32,2}) at /data/.julia/packages/Tracker/RRYy6/src/back.jl:90
[24] #gradient#24 at /data/.julia/packages/Tracker/RRYy6/src/back.jl:164 [inlined]
[25] gradient at /data/.julia/packages/Tracker/RRYy6/src/back.jl:164 [inlined]
[26] df(::CuArray{Float32,2}) at ./none:1
[27] top-level scope at none:0
This works, though:
f(x) = sum(map(t->sqrt(sum(t.*t)), x[i, :] for i=1:size(x,1)))
df(x)
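Another possible workaround, sketched here as an assumption rather than something I have verified on the GPU: the trace above suggests that mapslices ends up writing Tracker.TrackedReal elements into a CuArray, and the setindex_kernel! for that element type fails to compile. A formulation that uses only broadcasting and a reduction along dims=2 avoids the per-slice assignment. The names f_slices and f_bcast are hypothetical; the snippet only checks on the CPU that both formulations compute the same value.

```julia
# CPU-only sanity check, no GPU packages required.
# f_slices mirrors the original f; f_bcast is a hypothetical replacement
# that computes the same sum of row norms without mapslices.
f_slices(x) = sum(mapslices(t -> sqrt(sum(t .* t)), x, dims=2))
f_bcast(x) = sum(sqrt.(sum(x .* x, dims=2)))

x_cpu = rand(Float32, 10, 10)

# Both formulations should agree up to floating-point rounding.
@assert f_slices(x_cpu) ≈ f_bcast(x_cpu)
```

If broadcasting and sum with dims are supported for tracked CuArrays in your package versions, Tracker.gradient(f_bcast, x) with x on the GPU may work where the mapslices version does not.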