CUDA.jl mystery: VSCode + Julia extension works fine but command-line run fails

Hi,
I have run into a mysterious issue with CUDA.jl. When I run the code from within the file (in VSCode with the Julia extension), everything seems fine and I get the expected answer. But if I run the same code from the command line, I get the following error:

```
  Activating project at `~/jianguoyun/Nutstore/RigorousCoupledWaveAnalysis.jl-master`
eigenmodes:
etm_propagate:
ERROR: LoadError: CUBLASError: an invalid value was used as an argument (code 7, CUBLAS_STATUS_INVALID_VALUE)
Stacktrace:
  [1] throw_api_error(res::CUDA.CUBLAS.cublasStatus_t)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/DfvRa/lib/cublas/error.jl:50
  [2] macro expansion
    @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/error.jl:63 [inlined]
  [3] cublasZgemv_v2(handle::Ptr{Nothing}, trans::Char, m::Int64, n::Int64, alpha::Bool, A::CUDA.CuArray{ComplexF64, 2, CUDA.Mem.DeviceBuffer}, lda::Int64, x::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer}, incx::Int64, beta::Bool, y::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer}, incy::Int64)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/DfvRa/lib/utils/call.jl:26
  [4] gemv!
    @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/wrappers.jl:331 [inlined]
  [5] gemv_dispatch!(Y::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer}, A::CUDA.CuArray{ComplexF64, 2, CUDA.Mem.DeviceBuffer}, B::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer}, alpha::Bool, beta::Bool)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/DfvRa/lib/cublas/linalg.jl:179
  [6] mul!
    @ ~/.julia/packages/CUDA/DfvRa/lib/cublas/linalg.jl:188 [inlined]
  [7] mul!
    @ ~/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:275 [inlined]
  [8] *(A::CUDA.CuArray{ComplexF64, 2, CUDA.Mem.DeviceBuffer}, x::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer})
    @ LinearAlgebra ~/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:51
  [9] etm_propagate_gpu(sup::RigorousCoupledWaveAnalysis.Common.Halfspace, sub::RigorousCoupledWaveAnalysis.Common.Halfspace, ems_gpu::Vector{RigorousCoupledWaveAnalysis.Common.Eigenmodes}, ψin::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer}, get_r::Bool)
    @ RigorousCoupledWaveAnalysis.ETM ~/jianguoyun/Nutstore/RigorousCoupledWaveAnalysis.jl-master/src/ETM/ETM.jl:133
 [10] etm_propagate
    @ ~/jianguoyun/Nutstore/RigorousCoupledWaveAnalysis.jl-master/src/ETM/ETM.jl:43 [inlined]
 [11] etm_propagate(sup::RigorousCoupledWaveAnalysis.Common.Halfspace, sub::RigorousCoupledWaveAnalysis.Common.Halfspace, em::Vector{RigorousCoupledWaveAnalysis.Common.Eigenmodes}, ψin::CUDA.CuArray{ComplexF64, 1, CUDA.Mem.DeviceBuffer}, grd::RigorousCoupledWaveAnalysis.Common.RCWAGrid)
    @ RigorousCoupledWaveAnalysis.ETM ~/jianguoyun/Nutstore/RigorousCoupledWaveAnalysis.jl-master/src/ETM/ETM.jl:38
 [12] top-level scope
    @ ~/jianguoyun/Nutstore/RigorousCoupledWaveAnalysis.jl-master/examples/test.augel2018.jl:56
in expression starting at /home/dabajabaza/jianguoyun/Nutstore/RigorousCoupledWaveAnalysis.jl-master/examples/test.augel2018.jl:56
```
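
For context, the failing frame (`cublasZgemv_v2`) is a plain dense complex matrix-vector product. A quick way to check whether that code path fails at all from the command line is an isolated test like this (my own sketch, not code from the package):

```julia
using CUDA, LinearAlgebra

# Exercise the same path as the stack trace: * -> mul! -> CUBLAS gemv!
A = CuArray(rand(ComplexF64, 4, 4))  # CuMatrix{ComplexF64} on the device
x = CuArray(rand(ComplexF64, 4))     # CuVector{ComplexF64} on the device
y = A * x                            # should call cublasZgemv_v2 internally
println(collect(y))                  # copy back to the host and print
```

If this runs cleanly, the problem is more likely in how the package's arrays are produced than in CUBLAS itself.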

Does anyone know what is going wrong?

I don't have any issues when I run the code line by line in VSCode; I get the expected results.

Perhaps this is related to stream synchronization, because when I profile the code I find that one line is invoked many times and accounts for a lot of the total time.
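
As a coarse cross-check of where the time goes (a generic sketch of mine, not the profile from my run), `CUDA.@time` reports both CPU- and GPU-side time and allocations, which helps separate allocation pressure from compute time:

```julia
using CUDA

A = CuArray(rand(ComplexF64, 256, 256))
x = CuArray(rand(ComplexF64, 256))
CUDA.@time A * x  # prints elapsed time plus host and device allocations
```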

It is perhaps related to stream-ordered allocations.
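
If stream-ordered allocations are the culprit, one diagnostic I can try (an assumption on my part, not a confirmed fix) is to disable CUDA.jl's memory pool via the `JULIA_CUDA_MEMORY_POOL` environment variable, which must be set before CUDA.jl initializes:

```julia
# Disable the stream-ordered memory pool as a diagnostic (assumption, not a fix).
# Must take effect before CUDA.jl initializes; setting it before `using CUDA`
# is safest, or set it in the shell: JULIA_CUDA_MEMORY_POOL=none julia script.jl
ENV["JULIA_CUDA_MEMORY_POOL"] = "none"

using CUDA
CUDA.versioninfo()  # prints toolkit/driver details to confirm the environment
```

If the command-line run succeeds with the pool disabled, that would point at the allocator rather than the multiplication itself.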

Still working on this issue…
I have pinned the error down to the following:

Inside a function f(cu_A, cu_v), multiplying the CuVector cu_v by the CuArray cu_A causes the error. Calling synchronize() does not help. Running the same code line by line in VSCode does not trigger the error.
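
A minimal sketch of that pattern (f, cu_A, and cu_v are the placeholder names from my description; the array contents here are random stand-ins, so this reduction may or may not reproduce the error outside the package):

```julia
using CUDA, LinearAlgebra

# Placeholder reduction of the failing pattern; in the real code cu_A and
# cu_v come out of the RCWA eigenmode computation, not rand().
function f(cu_A::CuMatrix{ComplexF64}, cu_v::CuVector{ComplexF64})
    return cu_A * cu_v  # the multiplication that raises CUBLAS_STATUS_INVALID_VALUE
end

cu_A = CuArray(rand(ComplexF64, 8, 8))
cu_v = CuArray(rand(ComplexF64, 8))
CUDA.synchronize()  # inserting a synchronize() before the call does not help
w = f(cu_A, cu_v)
```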