I have some code that is similar to an example from the CUDA.jl Introduction. Specifically, this one:
using CUDA

# Problem size and device-resident inputs for the add example.
N = 2^20
x_d = CUDA.fill(1.0f0, N)  # GPU vector of N Float32 ones
y_d = CUDA.fill(2.0f0, N)  # GPU vector of N Float32 twos
"""
    gpu_add2!(y, x)

CUDA kernel: accumulate `x` into `y` element-wise (`y .+= x`).
Launched with a single block; each thread starts at its own thread
index and advances by the block's thread count until `y` is covered.
"""
function gpu_add2!(y, x)
    start = threadIdx().x   # linear indexing suffices, so only `.x` is used
    step = blockDim().x     # one stride per full warp of block threads
    i = start
    while i <= length(y)
        @inbounds y[i] += x[i]
        i += step
    end
    return nothing
end
using Test  # `@test` lives in the Test stdlib; without this the snippet fails with UndefVarError

fill!(y_d, 2)  # reset the output vector so the check below is meaningful on re-runs
# Launch the kernel with a single block of 256 threads; the kernel's
# stride loop covers all N elements despite N >> 256.
@cuda threads=256 gpu_add2!(y_d, x_d)
# Copy back to the host and verify every element is 1.0f0 + 2.0f0.
@test all(Array(y_d) .== 3.0f0)
Running this example seems to cause an error on my computer:
ERROR: Format of __nvvm__reflect function not recognized
Stacktrace:
[1] error(s::String)
@ Base .\error.jl:35
[2] macro expansion
@ C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\ptx.jl:439 [inlined]
[3] macro expansion
@ C:\Users\Mark Lau\.julia\packages\TimerOutputs\LHjFw\src\TimerOutput.jl:253 [inlined]
[4] nvvm_reflect!(fun::LLVM.Function)
@ GPUCompiler C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\ptx.jl:413
[5] function_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
@ LLVM C:\Users\Mark Lau\.julia\packages\LLVM\X1AeZ\src\pass.jl:49
[6] LLVMRunPassManager
@ C:\Users\Mark Lau\.julia\packages\LLVM\X1AeZ\lib\13\libLLVM_h.jl:4898 [inlined]
[7] run!
@ C:\Users\Mark Lau\.julia\packages\LLVM\X1AeZ\src\passmanager.jl:39 [inlined]
[8] macro expansion
@ C:\Users\Mark Lau\.julia\packages\LLVM\X1AeZ\src\base.jl:102 [inlined]
[9] optimize_module!(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget}, mod::LLVM.Module)
@ GPUCompiler C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\ptx.jl:149
[10] optimize!(job::GPUCompiler.CompilerJob, mod::LLVM.Module)
@ GPUCompiler C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\optim.jl:245
[11] macro expansion
@ C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\driver.jl:342 [inlined]
[12] macro expansion
@ C:\Users\Mark Lau\.julia\packages\TimerOutputs\LHjFw\src\TimerOutput.jl:253 [inlined]
[13] macro expansion
@ C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\driver.jl:341 [inlined]
[14] macro expansion
@ C:\Users\Mark Lau\.julia\packages\TimerOutputs\LHjFw\src\TimerOutput.jl:253 [inlined]
[15] macro expansion
@ C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\driver.jl:331 [inlined]
[16] emit_llvm(job::GPUCompiler.CompilerJob, method_instance::Any; libraries::Bool, deferred_codegen::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool, ctx::LLVM.Context)
@ GPUCompiler C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\utils.jl:83
[17] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
@ CUDA C:\Users\Mark Lau\.julia\packages\CUDA\ZdCxS\src\compiler\execution.jl:360
[18] #221
@ C:\Users\Mark Lau\.julia\packages\CUDA\ZdCxS\src\compiler\execution.jl:354 [inlined]
[19] JuliaContext(f::CUDA.var"#221#222"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(gpu_add2!), Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}}}}})
@ GPUCompiler C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\driver.jl:76
[20] cufunction_compile(job::GPUCompiler.CompilerJob)
@ CUDA C:\Users\Mark Lau\.julia\packages\CUDA\ZdCxS\src\compiler\execution.jl:353
[21] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler C:\Users\Mark Lau\.julia\packages\GPUCompiler\S3TWf\src\cache.jl:90
[22] cufunction(f::typeof(gpu_add2!), tt::Type{Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}}}; name::Nothing, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA C:\Users\Mark Lau\.julia\packages\CUDA\ZdCxS\src\compiler\execution.jl:306
[23] cufunction(f::typeof(gpu_add2!), tt::Type{Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}}})
@ CUDA C:\Users\Mark Lau\.julia\packages\CUDA\ZdCxS\src\compiler\execution.jl:299
[24] top-level scope
@ C:\Users\Mark Lau\.julia\packages\CUDA\ZdCxS\src\compiler\execution.jl:102
I used to be able to run code like this, but after recently updating my packages through the package manager I started running into this problem, so maybe that has something to do with it.
Here’s my CUDA.versioninfo()
output:
CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.0
Unknown NVIDIA driver
Libraries:
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: missing
Toolchain:
- Julia: 1.8.5
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
1 device:
0: NVIDIA GeForce GTX 1070 (sm_61, 7.030 GiB / 8.000 GiB available)
Has anyone seen this error before? Any ideas what might be causing this?