I have installed CUDA.jl and have checked with JULIA_DEBUG=CUDA and CUDA.version() that the local CUDA installation is indeed being used.
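For reference, this is roughly how I run that check (a sketch from memory; I've left out the debug log lines, which are long):

julia> ENV["JULIA_DEBUG"] = "CUDA"   # enable CUDA.jl's debug logging for this session
"CUDA"

julia> using CUDA                    # the debug output then reports which toolkit is being picked up

But to my chagrin, not even the simplest code works: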
julia> using CUDA
julia> CUDA.version()
v"11.0.0"
julia> CUDA.functional(true)
true
julia> a = CuArray{Int}(undef, 1024);
julia> b = copy(a);
julia> fill!(b, 0)
ERROR: cfunction: closures are not supported on this platform
Stacktrace:
[1] compile_method_instance(job::GPUCompiler.CompilerJob, method_instance::Core.MethodInstance)
@ GPUCompiler ~/.julia/packages/GPUCompiler/eJOtJ/src/jlgen.jl:325
[2] macro expansion
@ ~/.julia/packages/TimerOutputs/PZq45/src/TimerOutput.jl:226 [inlined]
[3] irgen(job::GPUCompiler.CompilerJob, method_instance::Core.MethodInstance)
@ GPUCompiler ~/.julia/packages/GPUCompiler/eJOtJ/src/irgen.jl:4
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/eJOtJ/src/driver.jl:142 [inlined]
[5] macro expansion
@ ~/.julia/packages/TimerOutputs/PZq45/src/TimerOutput.jl:226 [inlined]
[6] macro expansion
@ ~/.julia/packages/GPUCompiler/eJOtJ/src/driver.jl:141 [inlined]
[7] emit_llvm(job::GPUCompiler.CompilerJob, method_instance::Any, world::UInt64; libraries::Bool, deferred_codegen::Bool, optimize::Bool, only_entry::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/eJOtJ/src/utils.jl:62
[8] emit_llvm(job::GPUCompiler.CompilerJob, method_instance::Any, world::UInt64)
@ GPUCompiler ~/.julia/packages/GPUCompiler/eJOtJ/src/utils.jl:60
[9] cufunction_compile(job::GPUCompiler.CompilerJob)
@ CUDA ~/.julia/packages/CUDA/3VnCC/src/compiler/execution.jl:300
[10] check_cache
@ ~/.julia/packages/GPUCompiler/eJOtJ/src/cache.jl:47 [inlined]
[11] cached_compilation
@ ~/.julia/packages/GPUArrays/Z5nPF/src/host/construction.jl:6 [inlined]
[12] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{GPUArrays.var"#4#5", Tuple{CUDA.CuKernelContext, CuDeviceVector{Int64, 1}, Int64}}}, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/eJOtJ/src/cache.jl:0
[13] cufunction(f::GPUArrays.var"#4#5", tt::Type{Tuple{CUDA.CuKernelContext, CuDeviceVector{Int64, 1}, Int64}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~/.julia/packages/CUDA/3VnCC/src/compiler/execution.jl:289
[14] cufunction
@ ~/.julia/packages/CUDA/3VnCC/src/compiler/execution.jl:283 [inlined]
[15] macro expansion
@ ~/.julia/packages/CUDA/3VnCC/src/compiler/execution.jl:102 [inlined]
[16] #launch_heuristic#286
@ ~/.julia/packages/CUDA/3VnCC/src/gpuarrays.jl:17 [inlined]
[17] launch_heuristic
@ ~/.julia/packages/CUDA/3VnCC/src/gpuarrays.jl:17 [inlined]
[18] gpu_call(::GPUArrays.var"#4#5", ::CuArray{Int64, 1}, ::Int64; target::CuArray{Int64, 1}, total_threads::Nothing, threads::Nothing, blocks::Nothing, name::Nothing)
@ GPUArrays ~/.julia/packages/GPUArrays/Z5nPF/src/device/execution.jl:61
[19] gpu_call
@ ~/.julia/packages/GPUArrays/Z5nPF/src/device/execution.jl:46 [inlined]
[20] fill!(A::CuArray{Int64, 1}, x::Int64)
@ GPUArrays ~/.julia/packages/GPUArrays/Z5nPF/src/host/construction.jl:5
[21] top-level scope
@ REPL[14]:1
[22] top-level scope
@ ~/.julia/packages/CUDA/3VnCC/src/initialization.jl:81
Or
julia> cu(rand(5)) .* cu(rand(5))
[same error]
But,
julia> cu(rand(5,5))*cu(rand(5))
5-element CuArray{Float32, 1}:
1.8394606
1.5616181
1.4858192
1.0533601
1.7902193
So it seems that operations which route to CUBLAS (like the matrix-vector product) work, while anything that needs a kernel compiled on the fly through GPUCompiler fails. My main problem, though, is that I can't use a Flux model on the GPU.
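For concreteness, this is the kind of thing I mean (a minimal sketch, not my actual model; the layer and sizes here are arbitrary):

julia> using Flux, CUDA

julia> m = Dense(5, 2) |> gpu;     # move a single dense layer to the GPU

julia> x = cu(rand(Float32, 5));   # GPU input vector

julia> m(x)                        # forward pass: the matmul is fine, but the bias/activation broadcast needs a compiled kernel

With my setup this fails as well, apparently at the same kernel-compilation step as the fill! and broadcast examples above.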
Any idea how I can correct this behavior?