Perplexing behavior when computing the matmul smoke test on GPU in Julia


@avikpal, @ChrisRackauckas Why is it that when I compute the matmul smoke test on the GPU for the first time and time it, it takes longer than the CPU computation and the allocations are huge, yet when I run the same operation again it is fast? What is the reason behind this? I have attached a screenshot of what I’m doing for you to have a look at. Please explain it to me.

Thanks in advance

That’s just JIT compilation?

It’s described at the beginning of the abstract here: Reducing Compilation Latency in the Julia Programming Language
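As a rough illustration (a minimal sketch with made-up sizes, not your exact smoke test), timing the same GPU matmul twice shows the effect: the first call pays the one-time JIT compilation cost, the second call reuses the already compiled code.

```julia
using CUDA

A = CUDA.rand(Float32, 2048, 2048)
B = CUDA.rand(Float32, 2048, 2048)

# First call: the time and allocations are dominated by JIT compilation of the
# GPU matmul path for these argument types.
@time CUDA.@sync A * B

# Second call: the compiled code is cached and reused, so this measures the
# actual kernel launch and execution.
@time CUDA.@sync A * B
```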

Thanks for the pointer and for helping me out. I have one more question: when I do a GPU-based computation, I get this error:
ERROR: LoadError: GPU compilation of MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::CUDA.CuKernelContext, ::CuDeviceMatrix{Float32, 1}, ::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(-), Tuple{Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{LinearAlgebra.Adjoint{Float32, CuDeviceVector{Float32, 1}}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, ::Int64) failed
KernelError: passing and using non-bitstype argument
Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(-), Tuple{Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{LinearAlgebra.Adjoint{Float32, CuDeviceVector{Float32, 1}}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, which is not isbits:
.args is of type Tuple{Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Extruded{LinearAlgebra.Adjoint{Float32, CuDeviceVector{Float32, 1}}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}} which is not isbits.
.1 is of type Base.Broadcast.Extruded{Matrix{Float32}, Tuple{Bool, Bool}, Tuple{Int64, Int64}} which is not isbits.
.x is of type Matrix{Float32} which is not isbits.
Can someone help me out with this error? I know that to trace the error you need the actual code, but can someone explain the reason for this error and what type of error this is in general?

What code is that error from? Your code above does not broadcast.
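In general, that KernelError means a host (CPU) array ended up inside a broadcast that CUDA.jl tries to compile into a GPU kernel: the error shows a plain Matrix{Float32} being broadcast (with -) against the adjoint of a GPU vector, and a Matrix is not an isbits type, so it cannot be passed to the kernel. A minimal sketch of that situation (hypothetical names and sizes, not your code) would look something like this:

```julia
using CUDA

x = rand(Float32, 4, 3)      # plain CPU matrix -- Matrix{Float32} is not isbits
w = CUDA.rand(Float32, 3)    # GPU vector

# The CuArray side wins the broadcast, so Julia tries to run this on the GPU,
# but the CPU matrix gets captured in the kernel arguments and triggers the
# "non-bitstype argument" KernelError:
# y = x .- w'

# Moving the CPU data to the device first makes every broadcast argument a
# GPU array, so the kernel can be compiled:
y = cu(x) .- w'
```

If that is what is happening, the usual fix is to make sure both the data and the parameters live on the GPU before the broadcast happens.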

The error occurs when I’m trying to use the Lux framework in a NeuralPDE PINN setup on the GPU.