Problem with running Julia 1.6.7 with CUDA 11.7

I have a problem running CUDA on Julia and can't figure out what is going wrong. Allocating arrays seems to work fine, but any kind of operation on them crashes with a long error report.

using CUDA
CUDA.functional()  # returns true

N = 2^20
x = CUDA.fill(1.0f0, N)  # a vector of N Float32 ones; works fine
y = CUDA.fill(2.0f0, N)  # a vector of N Float32 twos; works fine

y .+= x  # crashes the REPL in VS Code; prints a long error report in the plain Julia REPL

This is the Julia version being used:
Julia Version 1.6.7
Commit 3b76b25b64 (2022-07-19 15:11 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core™ i5-8400 CPU @ 2.80GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS =

This is the output of CUDA.versioninfo():
CUDA toolkit 11.7, artifact installation
NVIDIA driver 516.94.0, for CUDA 11.7
CUDA driver 11.7

Libraries:

  • CUBLAS: 11.10.1
  • CURAND: 10.2.10
  • CUFFT: 10.7.1
  • CUSOLVER: 11.3.5
  • CUSPARSE: 11.7.3
  • CUPTI: 17.0.0
  • NVML: 11.0.0+516.94
  • CUDNN: 8.30.2 (for CUDA 11.5.0)
  • CUTENSOR: 1.4.0 (for CUDA 11.5.0)

Toolchain:

  • Julia: 1.6.7
  • LLVM: 11.0.1
  • PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
  • Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

1 device:
0: NVIDIA GeForce GTX 1070 (sm_61, 7.175 GiB / 8.000 GiB available)

And this is the error report after running the code above (running pkg> test CUDA also produces a similar error for each test):

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ebb21f – ZN4llvm25remapInstructionsInBlocksERKNS_15SmallVectorImplIPNS_10BasicBlockEEERNS_8ValueMapIPKNS_5ValueENS_14WeakTrackingVHENS_14ValueMapConfigIS9_NS_3sys10SmartMutexILb0EEEEEEE at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\LLVM.dll (unknown line)
in expression starting at REPL[10]:1
ZN4llvm25remapInstructionsInBlocksERKNS_15SmallVectorImplIPNS_10BasicBlockEEERNS_8ValueMapIPKNS_5ValueENS_14WeakTrackingVHENS_14ValueMapConfigIS9_NS_3sys10SmartMutexILb0EEEEEEE at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\LLVM.dll (unknown line)
ZN4llvm17CloneFunctionIntoEPNS_8FunctionEPKS0_RNS_8ValueMapIPKNS_5ValueENS_14WeakTrackingVHENS_14ValueMapConfigIS7_NS_3sys10SmartMutexILb0EEEEEEEbRNS_15SmallVectorImplIPNS_10ReturnInstEEEPKcPNS_14ClonedCodeInfoEPNS_20ValueMapTypeRemapperEPNS_17ValueMaterializerE at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\LLVM.dll (unknown line)
LLVMCloneFunctionInto at C:\Users\Paldßn.julia\artifacts\39f327e25ea056497ed1e8b0d595b85576936986\bin\libLLVMExtra-11.dll (unknown line)
LLVMCloneFunctionInto at C:\Users\Paldán.julia\packages\LLVM\WjSQG\lib\libLLVM_extra.jl:323
#clone_into!#77 at C:\Users\Paldán.julia\packages\LLVM\WjSQG\src\utils.jl:35
clone_into!##kw at C:\Users\Paldán.julia\packages\LLVM\WjSQG\src\utils.jl:16 [inlined]
macro expansion at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\irgen.jl:504 [inlined]
macro expansion at C:\Users\Paldán.julia\packages\LLVM\WjSQG\src\base.jl:102 [inlined]
macro expansion at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\irgen.jl:476 [inlined]
macro expansion at C:\Users\Paldán.julia\packages\TimerOutputs\jgSVI\src\TimerOutput.jl:252 [inlined]
lower_byval at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\irgen.jl:404
unknown function (ip: 000000000137cd18)
finish_module! at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\ptx.jl:187
unknown function (ip: 0000000045fe19d8)
macro expansion at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\driver.jl:260 [inlined]
#emit_llvm#104 at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\utils.jl:64
unknown function (ip: 0000000045fc358a)
emit_llvm##kw at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\utils.jl:62 [inlined]
cufunction_compile at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:353
#222 at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:347 [inlined]
JuliaContext at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\driver.jl:74
unknown function (ip: 0000000045f7fc53)
cufunction_compile at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:346
cached_compilation at C:\Users\Paldán.julia\packages\GPUCompiler\iaKrd\src\cache.jl:90
#cufunction#219 at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:299
cufunction at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:293 [inlined]
macro expansion at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:102 [inlined]
#launch_heuristic#246 at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\gpuarrays.jl:17 [inlined]
launch_heuristic##kw at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\gpuarrays.jl:17
_copyto! at C:\Users\Paldán.julia\packages\GPUArrays\gok9K\src\host\broadcast.jl:73 [inlined]
materialize! at C:\Users\Paldán.julia\packages\GPUArrays\gok9K\src\host\broadcast.jl:51 [inlined]
materialize! at .\broadcast.jl:891
unknown function (ip: 0000000045f75d14)
jl_clear_implicit_imports at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_clear_implicit_imports at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_clear_implicit_imports at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_interpret_toplevel_thunk at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_toplevel_eval_flex at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_toplevel_eval_flex at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_clear_implicit_imports at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_clear_implicit_imports at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_interpret_toplevel_thunk at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_toplevel_eval_flex at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
jl_toplevel_eval_in at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
unknown function (ip: 000000006b492667)
unknown function (ip: 000000006b4930d5)
unknown function (ip: 000000006b07c071)
unknown function (ip: 000000006b0b5072)
unknown function (ip: 000000006b0b569e)
unknown function (ip: 000000006aebb59a)
unknown function (ip: 000000006aebb641)
jl_f__call_latest at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
unknown function (ip: 000000006b36f449)
unknown function (ip: 000000006b37dec9)
unknown function (ip: 000000006aea7ee1)
unknown function (ip: 000000006aea807e)
jl_call2 at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
repl_entrypoint at C:\Users\Paldßn\myOwnPrograms\Julia-1.6.7\bin\libjulia-internal.dll (unknown line)
unknown function (ip: 0000000000401a63)
BaseThreadInitThunk at C:\WINDOWS\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\WINDOWS\SYSTEM32\ntdll.dll (unknown line)
Allocations: 47087010 (Pool: 47072548; Big: 14462); GC: 51

I assume you’re using Julia builds from the home page?
Can you try upgrading Julia?

Yes, I have tried with the latest version as well, 1.8.1, but for some reason I can't add the CUDA package with that version. I get errors for four dependencies (LLVM, GPUCompiler, GPUArrays, and CUDA), and when I try to precompile I get the following error:

ERROR: The following 1 direct dependency failed to precompile:

CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]

Failed to precompile CUDA [052768ef-5323-5732-b1bb-66c8b64840ba] to C:\Users\Paldán.julia\compiled\v1.8\CUDA\jl_E37.tmp.
ERROR: LoadError: type Nothing has no field captures
Stacktrace:
[1] top-level scope
@ C:\Users\Paldán.julia\packages\LLVM\WjSQG\src\LLVM.jl:14
[2] top-level scope
@ stdin:1
in expression starting at C:\Users\Paldán.julia\packages\LLVM\WjSQG\src\LLVM.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile LLVM [929cbde3-209d-540e-8aea-75f648917ca0] to C:\Users\Paldán.julia\compiled\v1.8\LLVM\jl_1082.tmp.
Stacktrace:
[1] top-level scope
@ stdin:1
in expression starting at C:\Users\Paldán.julia\packages\GPUCompiler\07qaN\src\GPUCompiler.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile GPUCompiler [61eb1bfa-7361-4325-ad38-22787b887f55] to C:\Users\Paldán.julia\compiled\v1.8\GPUCompiler\jl_EEB.tmp.
Stacktrace:
[1] top-level scope
@ stdin:1
in expression starting at C:\Users\Paldán.julia\packages\CUDA\DfvRa\src\CUDA.jl:1
in expression starting at stdin:1

You should provide more details, e.g. which versions of packages you're using. Nothing in the latest release of LLVM.jl uses regexes in src/LLVM.jl (the file in your backtrace).

Could you be more specific? I only have the CUDA package installed.

LLVM.jl is a dependency of CUDA.jl. Try st -m in the package manager.
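For reference, st -m prints the full manifest, including indirect dependencies such as LLVM.jl; a sketch of how to run it (the prompt shown is illustrative, and Pkg.status offers the same from ordinary code):

```
(@v1.8) pkg> st -m    # press ] at the julia> prompt to enter the package manager

julia> import Pkg; Pkg.status(mode=Pkg.PKGMODE_MANIFEST)  # equivalent API call
```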

The problem was most likely that the file path to all my packages contained the non-ASCII character 'á'.
I created a new Windows user with an ASCII-only name, ran everything again, and it all works without problems. I'm not sure whether avoiding non-ASCII characters in file paths should count as common sense, or whether this is a problem that should be fixed :slight_smile:
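If anyone wants to check whether their own paths are affected, here is a quick sketch (in Python, since a later reply suggests comparing path handling with Python; the path below is a made-up stand-in, not taken from this thread):

```python
# Find any non-ASCII characters in a path (the path here is a hypothetical example).
path = r"C:\Users\Paldán\.julia"

non_ascii = [(i, ch) for i, ch in enumerate(path) if ord(ch) > 127]
print(non_ascii)  # [(13, 'á')]
```

An empty list means the path is pure ASCII and should be unaffected by this particular failure mode.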

That shouldn’t matter (at least ideally), and I don’t think any legal UTF-8 string would be an issue on Linux, macOS, etc. I’m not sure what’s happening for you, but if this is a problem for you (in general with Julia) on Windows, I suggest filing an issue for it on JuliaLang, not at a package (or maybe both?).

I’m also not sure what explains the differences between the traces, i.e. why some lines show Paldßn and others Paldán rather than consistently showing Paldán.

Even though file system names (and the Windows API) use UTF-16 on Windows, Python uses UTF-8, so could you check the same paths there, with and without the opt-in UTF-8 mode? That mode uses UTF-8/surrogatepass (the default is mbcs/replace, the legacy Windows filesystem encoding).

I only know that Julia uses UTF-8; I’m not up to speed on whether it distinguishes UTF-8/strict vs. UTF-8/surrogatepass vs. UTF-8/surrogateescape vs. UTF-8/backslashreplace.
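As a concrete illustration of those error handlers (a minimal Python sketch, not tied to this specific crash): strict UTF-8 handles á fine but rejects lone surrogates, which Windows UTF-16 filenames can legally contain, while surrogatepass lets them round-trip.

```python
# á (U+00E1) is ordinary two-byte UTF-8; strict encoding handles it fine.
print("Paldán".encode("utf-8"))  # b'Pald\xc3\xa1n'

# A lone surrogate (legal in Windows UTF-16 filenames) is rejected by strict UTF-8...
lone = "\ud800"
try:
    lone.encode("utf-8")
except UnicodeEncodeError as e:
    print("strict:", e.reason)

# ...but round-trips under the surrogatepass error handler:
raw = lone.encode("utf-8", "surrogatepass")
print(raw)                                           # b'\xed\xa0\x80'
print(raw.decode("utf-8", "surrogatepass") == lone)  # True
```

So the 'á' itself is valid UTF-8; if the path was the trigger here, the breakage more likely happened at an encoding boundary (e.g. a conversion through a legacy 8-bit codepage), not in UTF-8 itself.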

[I recall an issue from way back with letters like á, regarding macOS talking to Solaris, where one system used precomposed characters and the other did not; that could also be an issue, though I don’t think it’s the case here. I guess the filesystem should normalize rather than Julia.]

So do you think this is something that should be filed as an issue on JuliaLang?