NeuralPDE issue trying to use LuxAMDGPU

Hi everyone,

As a Julia newbie, I tried to solve a simple BVP with NeuralPDE
on GPUs.
Using LuxCUDA on NVIDIA hardware, everything was fine.
After switching to LuxAMDGPU (and AMD hardware),
the code no longer worked.
The first error (left half of the screenshot below) occurred at

ps = Lux.setup(rng, chain)[1] |> ComponentArray |> gpud .|> Float32

By omitting the .|> Float32 part I could work around this problem,
but a second error occurred (right half of the screenshot below)
when starting the solver with

res = Optimization.solve(...)
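
For context, a minimal sketch of the surrounding setup (hypothetical names: chain is my Lux.Chain, and gpud comes from Lux's gpu_device()):

using Lux, LuxAMDGPU, ComponentArrays, Random

rng = Random.default_rng()
gpud = gpu_device()  # selects the ROCm backend once LuxAMDGPU is loaded

# the line from above where the first error occurred:
ps = Lux.setup(rng, chain)[1] |> ComponentArray |> gpud .|> Float32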

I would be grateful for any advice.
Regards,

Martin.

You should order the chain a bit differently here.

ps = Lux.setup(rng, chain)[1] |> gpud |> ComponentArray

The GPU device will cast your elements to Float32 anyway.
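
A hedged sketch of why the trailing cast is redundant (assuming rng, chain, and gpud = gpu_device() as above):

ps_cpu = Lux.setup(rng, chain)[1]      # NamedTuple; Lux initializes Float32 parameters by default
ps = ps_cpu |> gpud |> ComponentArray  # move to the device, then flatten
eltype(ps)                             # Float32 — no extra .|> Float32 needed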

Regarding the final error, I am not certain NeuralPDE supports AMD GPUs (with proper testing), but that is more of a @ChrisRackauckas question.

It should if Lux supports it, but I haven’t tested AMDGPUs on Lux enough to know what constructs are supported or not. It’s worth an issue, and we can set up CI and all of that, but I don’t think anyone has thoroughly tested the combination at this point.

Thanks a lot for your reply. Ordering the chain as suggested results (for both CUDA
and AMD) in an error

Scalar indexing is disallowed

since I have set CUDA/AMDGPU.allowscalar(false).
Removing this line, I get

┌ Warning: Performing scalar indexing on task Task (runnable) @0x00007fa1266c33a0.
│ Invocation of getindex resulted in scalar indexing of a GPU array.
│ This is typically caused by calling an iterating implementation of a method.
│ Such implementations do not execute on the GPU, but very slowly on the CPU,
│ and therefore should be avoided.
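
For context, this is what the allowscalar switch toggles (x is just a hypothetical test array; the same applies with CUDA):

using AMDGPU
AMDGPU.allowscalar(false)  # turn scalar indexing into a hard error
x = AMDGPU.rand(Float32, 4)
x[1]                       # ERROR: Scalar indexing is disallowed
AMDGPU.allowscalar(true)   # allow the slow CPU fallback, with the warning above
x[1]                       # works, but element-by-element via the host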

Sorry, my bad, your ordering was correct. The only thing I would change is |> Float32 to one of the eltype-conversion utilities from the Lux docs (Utilities | LuxDL Docs). If the problem still persists, please open an issue.
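
Concretely, a hedged sketch of the suggested change (f32 being the Lux utility in question):

ps = Lux.setup(rng, chain)[1] |> ComponentArray |> gpud |> f32  # instead of .|> Float32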

It should if Lux supports it, but I haven’t tested AMDGPUs on Lux enough to know what constructs are supported or not.

So from my experience, most things just work because on the Lux end we do have dispatches to swap things correctly. Something that I have found to be a bit flaky is 1) broadcasting and 2) views of ROCArrays. If there is a long broadcast chain, that causes problems with GPU compilation. For views, it’s almost always better to copy for AMDGPU until all the SubArray dispatches are ready on their end.
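
To illustrate the views point, a hedged sketch (x is a hypothetical array):

using AMDGPU
x = AMDGPU.rand(Float32, 100)
v = view(x, 1:10)  # SubArray of a ROCArray; some dispatches are still missing
w = x[1:10]        # non-scalar indexing materializes a fresh ROCArray (a copy)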

From the stacktrace (the full stacktrace might be more helpful if you have it), it seems like it is erroring in Optimization.jl even before it hits NeuralPDE or Lux.

No problem. I changed |> Float32 to |> f32, but I still get an error (see the trace below).


So is there something I can change in my code to get it running?
I am new to Julia, so I am not familiar with most of the internals.


ERROR: LoadError: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#38")(::AMDGPU.ROCKernelContext, ::ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, ::Base.Broadcast.Broadcasted{AMDGPU.ROCArrayStyle{1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{ComponentArrays.CombinedAxis{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}, Base.OneTo{Int64}}}, typeof(|>), Tuple{Base.Broadcast.Extruded{ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Tuple{Bool}, Tuple{Int64}}, AMDGPU.ROCRefValue{typeof(f32)}}}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.new_gc_frame)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.push_gc_frame)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.get_gc_frame_slot)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to getindex(t::Tuple, i::Int64) @ Base tuple.jl:31)
Stacktrace:
[1] _getindex
@ ./broadcast.jl:705
[2] _broadcast_getindex
@ ./broadcast.jl:681
[3] #31
@ ./broadcast.jl:1118
[4] ntuple
@ ./ntuple.jl:48
[5] copy
@ ./broadcast.jl:1118
[6] materialize
@ ./broadcast.jl:903
[7] axes
@ ~/.julia/packages/ComponentArrays/OQPt7/src/array_interface.jl:9
[8] CartesianIndices
@ ./multidimensional.jl:267
[9] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[10] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to jl_f_tuple)
Stacktrace:
[1] ntuple
@ ./ntuple.jl:48
[2] copy
@ ./broadcast.jl:1118
[3] materialize
@ ./broadcast.jl:903
[4] axes
@ ~/.julia/packages/ComponentArrays/OQPt7/src/array_interface.jl:9
[5] CartesianIndices
@ ./multidimensional.jl:267
[6] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[7] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.pop_gc_frame)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.new_gc_frame)
Reason: unsupported call to an unknown function (call to julia.push_gc_frame)
Reason: unsupported call to an unknown function (call to julia.pop_gc_frame)
Reason: unsupported call to an unknown function (call to julia.get_gc_frame_slot)
Reason: unsupported dynamic function invocation (call to broadcasted)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to materialize)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to CartesianIndices)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
[1] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[2] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
[1] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:50
Reason: unsupported dynamic function invocation (call to setindex!)
Stacktrace:
[1] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:50
Hint: catch this exception as err and call code_typed(err; interactive = true) to introspect the erronous code with Cthulhu.jl
Stacktrace:
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/validation.jl:147
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:440 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:439 [inlined]
[5] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:92
[6] emit_llvm
@ ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:86 [inlined]
[7] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:129
[8] codegen
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:110 [inlined]
[9] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:106
[10] compile
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:98 [inlined]
[11] #40
@ ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:140 [inlined]
[12] JuliaContext(f::AMDGPU.Compiler.var"#40#41"{GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}})
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:47
[13] hipcompile(job::GPUCompiler.CompilerJob)
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:139
[14] actual_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::typeof(AMDGPU.Compiler.hipcompile), linker::typeof(AMDGPU.Compiler.hiplink))
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/execution.jl:125
[15] cached_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/execution.jl:103
[16] macro expansion
@ ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:107 [inlined]
[17] macro expansion
@ ./lock.jl:267 [inlined]
[18] hipfunction(f::GPUArrays.var"#broadcast_kernel#38", tt::Type{Tuple{AMDGPU.ROCKernelContext, ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Base.Broadcast.Broadcasted{AMDGPU.ROCArrayStyle{1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{ComponentArrays.CombinedAxis{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}, Base.OneTo{Int64}}}, typeof(|>), Tuple{Base.Broadcast.Extruded{ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Tuple{Bool}, Tuple{Int64}}, AMDGPU.ROCRefValue{typeof(f32)}}}, Int64}}; kwargs::@Kwargs{name::Nothing})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:101
[19] hipfunction
@ ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:100 [inlined]
[20] macro expansion
@ ~/.julia/packages/AMDGPU/kBMLx/src/highlevel.jl:157 [inlined]
[21] #gpu_call#48
@ ~/.julia/packages/AMDGPU/kBMLx/src/gpuarrays.jl:8 [inlined]
[22] gpu_call
@ ~/.julia/packages/AMDGPU/kBMLx/src/gpuarrays.jl:5 [inlined]
[23] gpu_call(::GPUArrays.var"#broadcast_kernel#38", ::ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, ::Base.Broadcast.Broadcasted{AMDGPU.ROCArrayStyle{1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{ComponentArrays.CombinedAxis{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}, Base.OneTo{Int64}}}, typeof(|>), Tuple{Base.Broadcast.Extruded{ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Tuple{Bool}, Tuple{Int64}}, Base.RefValue{typeof(f32)}}}, ::Int64; target::ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, 
ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, elements::Nothing, threads::Int64, blocks::Int64, name::Nothing)
@ GPUArrays ~/.julia/packages/GPUArrays/Hd5Sk/src/device/execution.jl:69
[24] gpu_call
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/execution.jl:34 [inlined]
[25] _copyto!
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:60 [inlined]
[26] copyto!
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:37 [inlined]
[27] copy
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:28 [inlined]
[28] materialize(bc::Base.Broadcast.Broadcasted{AMDGPU.ROCArrayStyle{1, AMDGPU.Runtime.Mem.HIPBuffer}, Nothing, typeof(|>), Tuple{ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Base.RefValue{typeof(f32)}}})
@ Base.Broadcast ./broadcast.jl:903
[29] top-level scope
@ ~/Desktop/2024_SS/Abschlussarbeiten/Julia/neuralpde_cuda_rocm/neuralpde_amd_error_1b.jl:62
in expression starting at /home/users/mre/Desktop/2024_SS/Abschlussarbeiten/Julia/neuralpde_cuda_rocm/neuralpde_amd_error_1b.jl:62
true

Below you’ll find the stack trace for the second error. I had to trim it
a bit (truncated lines end with …) because it was too large.


ERROR: LoadError: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#38" …
Reason: unsupported call to an unknown function (call to julia.new_gc_frame)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.push_gc_frame)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.get_gc_frame_slot)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to getindex(t::Tuple, i::Int64) @ Base tuple.jl:31)
Stacktrace:
[1] _getindex
@ ./broadcast.jl:705
[2] _broadcast_getindex
@ ./broadcast.jl:681
[3] #31
@ ./broadcast.jl:1118
[4] ntuple
@ ./ntuple.jl:48
[5] copy
@ ./broadcast.jl:1118
[6] materialize
@ ./broadcast.jl:903
[7] axes
@ ~/.julia/packages/ComponentArrays/OQPt7/src/array_interface.jl:9
[8] CartesianIndices
@ ./multidimensional.jl:267
[9] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[10] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to jl_f_tuple)
Stacktrace:
[1] ntuple
@ ./ntuple.jl:48
[2] copy
@ ./broadcast.jl:1118
[3] materialize
@ ./broadcast.jl:903
[4] axes
@ ~/.julia/packages/ComponentArrays/OQPt7/src/array_interface.jl:9
[5] CartesianIndices
@ ./multidimensional.jl:267
[6] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[7] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.pop_gc_frame)
Stacktrace:
[1] CartesianIndices
@ ./multidimensional.jl:267
[2] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[3] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported call to an unknown function (call to julia.new_gc_frame)
Reason: unsupported call to an unknown function (call to julia.push_gc_frame)
Reason: unsupported call to an unknown function (call to julia.pop_gc_frame)
Reason: unsupported call to an unknown function (call to julia.get_gc_frame_slot)
Reason: unsupported dynamic function invocation (call to broadcasted)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to materialize)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to CartesianIndices)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
[1] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[2] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
[1] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:50
Reason: unsupported dynamic function invocation (call to setindex!)
Stacktrace:
[1] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:50
Hint: catch this exception as err and call code_typed(err; interactive = true) to introspect the erronous code with Cthulhu.jl
Stacktrace:
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/validation.jl:147
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:440 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:439 [inlined]
[5] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:92
[6] emit_llvm
@ ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:86 [inlined]
[7] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:129
[8] codegen
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:110 [inlined]
[9] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:106
[10] compile
@ ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:98 [inlined]
[11] #40
@ ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:140 [inlined]
[12] JuliaContext(f::AMDGPU.Compiler.var"#40#41"{GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}})
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:47
[13] hipcompile(job::GPUCompiler.CompilerJob)
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:139
[14] actual_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::typeof(AMDGPU.Compiler.hipcompile), linker::typeof(AMDGPU.Compiler.hiplink))
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/execution.jl:125
[15] cached_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/execution.jl:103
[16] macro expansion
@ ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:107 [inlined]
[17] macro expansion
@ ./lock.jl:267 [inlined]
[18] hipfunction(f::GPUArrays.var"#broadcast_kernel#38", tt::Type{Tuple{AMDGPU.ROCKernelContext, ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1 …}
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:101
[19] hipfunction
@ ~/.julia/packages/AMDGPU/kBMLx/src/compiler/codegen.jl:100 [inlined]
[20] macro expansion
@ ~/.julia/packages/AMDGPU/kBMLx/src/highlevel.jl:157 [inlined]
[21] #gpu_call#48
@ ~/.julia/packages/AMDGPU/kBMLx/src/gpuarrays.jl:8 [inlined]
[22] gpu_call
@ ~/.julia/packages/AMDGPU/kBMLx/src/gpuarrays.jl:5 [inlined]
[23] gpu_call(::GPUArrays.var"#broadcast_kernel#38", ::ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer},…
@ GPUArrays ~/.julia/packages/GPUArrays/Hd5Sk/src/device/execution.jl:69
[24] gpu_call
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/execution.jl:34 [inlined]
[25] _copyto!
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:60 [inlined]
[26] materialize!
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:32 [inlined]
[27] materialize!(dest::ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer},…
@ Base.Broadcast ./broadcast.jl:911
[28] (::OptimizationZygoteExt.var"#38#56"{OptimizationZygoteExt.var"#37#55"{OptimizationFunction{true, AutoZygote, NeuralPDE.var"#full_loss_function#344"{NeuralPDE.var"#null_nonadaptive_loss#147",…
@ OptimizationZygoteExt ~/.julia/packages/Optimization/Zc00b/ext/OptimizationZygoteExt.jl:95
[29] macro expansion
@ ~/.julia/packages/OptimizationOptimisers/AOkbT/src/OptimizationOptimisers.jl:68 [inlined]
[30] macro expansion
@ ~/.julia/packages/Optimization/Zc00b/src/utils.jl:41 [inlined]
[31] __solve(cache::OptimizationCache{OptimizationFunction{true, AutoZygote, NeuralPDE.var"#full_loss_function#344"{NeuralPDE.var"#null_nonadaptive_loss#147", …
@ OptimizationOptimisers ~/.julia/packages/OptimizationOptimisers/AOkbT/src/OptimizationOptimisers.jl:66
[32] solve!(cache::OptimizationCache{OptimizationFunction{true, AutoZygote, NeuralPDE.var"#full_loss_function#344"{NeuralPDE.var"#null_nonadaptive_loss#147", …
@ SciMLBase ~/.julia/packages/SciMLBase/Dwomw/src/solve.jl:180
[33] solve(::OptimizationProblem{true, OptimizationFunction{true, AutoZygote, NeuralPDE.var"#full_loss_function#344"{NeuralPDE.var"#null_nonadaptive_loss#147", …
@ SciMLBase ~/.julia/packages/SciMLBase/Dwomw/src/solve.jl:96
[34] top-level scope
@ ~/Desktop/2024_SS/Abschlussarbeiten/Julia/neuralpde_cuda_rocm/neuralpde_amd_error_2b.jl:74
in expression starting at /home/users/mre/Desktop/2024_SS/Abschlussarbeiten/Julia/neuralpde_cuda_rocm/neuralpde_amd_error_2b.jl:74
true

Ah. Can you open an issue in LuxDL/Lux.jl for the f32? It is not hard to fix, but without an issue I will most likely miss it.

The other error that you get is actually from Optimization.jl (Optimization.jl/ext/OptimizationZygoteExt.jl at 038c7b6bd34111cd4a2fdb74886bf7c05a026c30 · SciML/Optimization.jl · GitHub), which is somewhat good news because it means AMDGPU is working on your system. But we need to patch the copyto! function in ComponentArrays.

As a simple test, can you try doing ps .= ps to see if this throws an error? (I would test it myself, but I don’t have access to an AMD GPU right now.)

No problem.

Adding the line ps .= ps resulted in the following error:

InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#38")(::AMDGPU.ROCKernelContext, ::ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, ::Base.Broadcast.Broadcasted{AMDGPU.ROCArrayStyle{1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{ComponentArrays.CombinedAxis{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}, Base.OneTo{Int64}}}, typeof(identity), Tuple{Base.Broadcast.Extruded{ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Tuple{Bool}, Tuple{Int64}}}}, ::Int64) resulted in invalid LLVM IR

@Vaibhavdixit02 is it possible to replace the broadcasting in Optimization.jl/ext/OptimizationZygoteExt.jl at 038c7b6bd34111cd4a2fdb74886bf7c05a026c30 · SciML/Optimization.jl · GitHub with a direct copyto!?
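
For illustration, a hedged sketch of the kind of change being proposed (hypothetical names; the actual line in OptimizationZygoteExt.jl may differ):

# Broadcasting assignment compiles a GPU kernel over the ComponentArray's
# axes, which is what fails here:
θ .= θ_new

# A direct copy instead hits ComponentArrays' own dispatch
# copyto!(::ComponentArray, ::ComponentArray) and avoids kernel compilation:
copyto!(θ, θ_new)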

I replaced the broadcasting with a copyto!. The type casting again throws an error (see below), so I removed the casting. The optimizer now stops with the message

MethodError: no method matching copyto!(::ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}})

Closest candidates are:
copyto!(::AbstractArray, ::Base.Broadcast.Broadcasted{<:GPUArraysCore.AbstractGPUArrayStyle})
@ GPUArrays ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:37
copyto!(::AbstractArray, ::Base.Broadcast.Broadcasted{<:StaticArraysCore.StaticArrayStyle})
@ StaticArrays ~/.julia/packages/StaticArrays/EHHaF/src/broadcast.jl:63
copyto!(::ComponentArray, ::ComponentArray)
@ ComponentArrays ~/.julia/packages/ComponentArrays/OQPt7/src/similar_convert_copy.jl:41


InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#38")(::AMDGPU.ROCKernelContext, ::ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, ::Base.Broadcast.Broadcasted{AMDGPU.ROCArrayStyle{1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{ComponentArrays.CombinedAxis{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}, Base.OneTo{Int64}}}, typeof(|>), Tuple{Base.Broadcast.Extruded{ComponentVector{Float32, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}}, Tuple{Bool}, Tuple{Int64}}, AMDGPU.ROCRefValue{typeof(f32)}}}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.new_gc_frame)
Reason: unsupported call to an unknown function (call to julia.push_gc_frame)
Reason: unsupported call to an unknown function (call to julia.pop_gc_frame)
Reason: unsupported call to an unknown function (call to julia.get_gc_frame_slot)
Reason: unsupported dynamic function invocation (call to broadcasted)
Stacktrace:
[1] CartesianIndices
@ ~/.julia/packages/ComponentArrays/OQPt7/src/axis.jl:205
[2] CartesianIndices
@ ./multidimensional.jl:267
[3] macro expansion
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/device/indexing.jl:81
[4] broadcast_kernel
@ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:49
Reason: unsupported dynamic function invocation (call to materialize)
Stacktrace:

Hi Avik,

ps = Lux.setup(rng, chain)[1] |> ComponentArray |> gpud |> f*

works as expected (thanks once again), but Optimization.solve still
throws a copyto! error:

MethodError: no method matching copyto!(::ComponentVector{Float32, ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, Tuple{Axis{(layer_1 = ViewAxis(1:60, Axis(weight = ViewAxis(1:30, ShapedAxis((30, 1))), bias = ViewAxis(31:60, ShapedAxis((30, 1))))), layer_2 = ViewAxis(61:990, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_3 = ViewAxis(991:1920, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_4 = ViewAxis(1921:2850, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_5 = ViewAxis(2851:3780, Axis(weight = ViewAxis(1:900, ShapedAxis((30, 30))), bias = ViewAxis(901:930, ShapedAxis((30, 1))))), layer_6 = ViewAxis(3781:3811, Axis(weight = ViewAxis(1:30, ShapedAxis((1, 30))), bias = ViewAxis(31:31, ShapedAxis((1, 1))))))}}})

Closest candidates are:
copyto!(::AbstractArray, ::Base.Broadcast.Broadcasted{<:GPUArraysCore.AbstractGPUArrayStyle})
@ GPUArrays ~/.julia/packages/GPUArrays/Hd5Sk/src/host/broadcast.jl:37
copyto!(::AbstractArray, ::Base.Broadcast.Broadcasted{<:StaticArraysCore.StaticArrayStyle})
@ StaticArrays ~/.julia/packages/StaticArrays/EHHaF/src/broadcast.jl:63
copyto!(::ComponentArray, ::ComponentArray)
@ ComponentArrays ~/.julia/packages/ComponentArrays/OQPt7/src/similar_convert_copy.jl:41

Stacktrace:
[1] (::OptimizationZygoteExt.var"#38#56"{OptimizationZygoteExt.var"#37#55"{OptimizationFunction{true, AutoZygote, NeuralPDE.var"#full_loss_function#344"{NeuralPDE.var"#null_nonadaptive_loss#147", Vector{NeuralPDE.var"#103#104"{NeuralPDE.var"#240#241"{RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##226"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0x242931ff, 0x09318e41, 0x527aa26e, 0xa956fc9a, 0x609538a5), Expr}, …

Can you open an issue with your MWE?

Just opened an issue for NeuralPDE.jl.

Yeah sure. I don’t know why I didn’t get a notification for this in my email.