A compiler error occurs when broadcasting a floating-point power over a CuArray: .^(::CuArray{Float32,1}, ::Float32)
I have also opened an issue on GitHub (here); I'm posting in this forum in case this is a usage error on my part rather than a bug.
A minimal working example:
using CuArrays, Flux
w = gpu(collect(1:10))  # CuArray{Float32,1}
w .^ 2      # works (integer literal exponent)
w .^ 2.0    # errors (Float64 exponent)
w .^ 2.0f0  # errors (Float32 exponent)
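Digging into the stacktrace below, the failure originates in `throw_exp_domainerror` inside `^` at math.jl:789. My reading (an assumption on my part, inferred from the trace): `w .^ 2` works because Julia lowers integer-literal powers through `Base.literal_pow`, which for `Val(2)` is just `x*x` and has no throwing branch, whereas the generic floating-point `^` method contains a `DomainError` path (negative base, non-integer exponent) whose string construction CUDAnative cannot compile into a GPU kernel. The difference is visible on the CPU alone:

```julia
# `x^2` is lowered to `Base.literal_pow(^, x, Val(2))`, which for
# Val(2) is a plain multiplication -- no DomainError branch, so it
# compiles cleanly for the GPU:
x = -2.0f0
@assert Base.literal_pow(^, x, Val(2)) == 4.0f0

# The generic float-power method, by contrast, can throw a DomainError
# (negative base with a non-integer exponent). That throwing branch is
# what allocates the string the GPU compiler rejects:
threw = try
    x ^ 2.5f0
    false
catch err
    err isa DomainError
end
@assert threw
```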
Stacktrace:
ERROR: InvalidIRError: compiling #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(^),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Float32}}) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to jl_alloc_string)
Stacktrace:
[1] _string_n at strings/string.jl:60
[2] string at strings/substring.jl:180
[3] throw_exp_domainerror at math.jl:35
[4] ^ at math.jl:789
[5] _broadcast_getindex_evalf at broadcast.jl:578
[6] _broadcast_getindex at broadcast.jl:551
[7] getindex at broadcast.jl:511
[8] #23 at C:\Users\Henri\.julia\packages\GPUArrays\t8tJB\src\broadcast.jl:50
Reason: unsupported call through a literal pointer (call to memcpy)
Stacktrace:
[1] unsafe_copyto! at array.jl:225
[2] __unsafe_string! at strings/substring.jl:167
[3] string at strings/substring.jl:183
[4] throw_exp_domainerror at math.jl:35
[5] ^ at math.jl:789
[6] _broadcast_getindex_evalf at broadcast.jl:578
[7] _broadcast_getindex at broadcast.jl:551
[8] getindex at broadcast.jl:511
[9] #23 at C:\Users\Henri\.julia\packages\GPUArrays\t8tJB\src\broadcast.jl:50
Reason: unsupported call to the Julia runtime (call to jl_box_float32)
Stacktrace:
[1] throw_exp_domainerror at math.jl:35
[2] ^ at math.jl:789
[3] _broadcast_getindex_evalf at broadcast.jl:578
[4] _broadcast_getindex at broadcast.jl:551
[5] getindex at broadcast.jl:511
[6] #23 at C:\Users\Henri\.julia\packages\GPUArrays\t8tJB\src\broadcast.jl:50
Stacktrace:
[1] check_ir(::CUDAnative.CompilerContext, ::LLVM.Module) at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\compiler\validation.jl:77
[2] compile(::CUDAnative.CompilerContext) at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\compiler\driver.jl:97
[3] #compile#109(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::VersionNumber, ::Any, ::Any) at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\compiler\driver.jl:45
[4] compile at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\compiler\driver.jl:43 [inlined]
[5] #compile#108(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::CUDAdrv.CuDevice, ::Function, ::Any) at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\compiler\driver.jl:18
[6] compile at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\compiler\driver.jl:16 [inlined]
[7] macro expansion at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\execution.jl:269 [inlined]
[8] #cufunction#123(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::getfield(GPUArrays, Symbol("##23#24")), ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(^),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Float32}}}}) at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\execution.jl:240
[9] cufunction(::Function, ::Type) at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\execution.jl:240
[10] macro expansion at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\execution.jl:208 [inlined]
[11] macro expansion at .\gcutils.jl:87 [inlined]
[12] macro expansion at C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\src\execution.jl:205 [inlined]
[13] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Float32,1}, ::Tuple{CuArray{Float32,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(^),Tuple{Base.Broadcast.Extruded{CuArray{Float32,1},Tuple{Bool},Tuple{Int64}},Float32}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at C:\Users\Henri\.julia\packages\CuArrays\qZCAt\src\gpuarray_interface.jl:59
[14] gpu_call(::Function, ::CuArray{Float32,1}, ::Tuple{CuArray{Float32,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(^),Tuple{Base.Broadcast.Extruded{CuArray{Float32,1},Tuple{Bool},Tuple{Int64}},Float32}}}, ::Int64) at C:\Users\Henri\.julia\packages\GPUArrays\t8tJB\src\abstract_gpu_interface.jl:151
[15] gpu_call at C:\Users\Henri\.julia\packages\GPUArrays\t8tJB\src\abstract_gpu_interface.jl:128 [inlined]
[16] copyto! at C:\Users\Henri\.julia\packages\GPUArrays\t8tJB\src\broadcast.jl:48 [inlined]
[17] copyto! at .\broadcast.jl:797 [inlined]
[18] copy at .\broadcast.jl:773 [inlined]
[19] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(^),Tuple{CuArray{Float32,1},Float32}}) at .\broadcast.jl:753
[20] top-level scope at none:0
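In case it helps, a possible workaround sketch (untested on my setup, and it assumes CUDAnative exposes a device-side `pow` wrapper over libdevice): broadcast `CUDAnative.pow` directly, which sidesteps the throwing branch of `Base.^` entirely.

```julia
using CuArrays, CUDAnative, Flux

w = gpu(collect(1:10))  # CuArray{Float32,1}

# Hypothetical workaround: call libdevice's pow through CUDAnative,
# bypassing Base.^ and its DomainError/string-allocation branch.
y = CUDAnative.pow.(w, 2.0f0)
```

This requires a CUDA-capable GPU to run, so I haven't been able to verify it beyond the reasoning above.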
Build log
Building MbedTLS ─────────→ `C:\Users\Henri\.julia\packages\MbedTLS\X4xar\deps\build.log`
Building WebIO ───────────→ `C:\Users\Henri\.julia\packages\WebIO\7G1ZY\deps\build.log`
Building Conda ───────────→ `C:\Users\Henri\.julia\packages\Conda\CpuvI\deps\build.log`
Building FFTW ────────────→ `C:\Users\Henri\.julia\packages\FFTW\p7sLQ\deps\build.log`
Building SpecialFunctions → `C:\Users\Henri\.julia\packages\SpecialFunctions\fvheQ\deps\build.log`
Building Rmath ───────────→ `C:\Users\Henri\.julia\packages\Rmath\Py9gH\deps\build.log`
Building PyCall ──────────→ `C:\Users\Henri\.julia\packages\PyCall\ttONZ\deps\build.log`
Building CUDAdrv ─────────→ `C:\Users\Henri\.julia\packages\CUDAdrv\lu32K\deps\build.log`
Building GR ──────────────→ `C:\Users\Henri\.julia\packages\GR\KGODl\deps\build.log`
Building LLVM ────────────→ `C:\Users\Henri\.julia\packages\LLVM\tg8MX\deps\build.log`
Building CodecZlib ───────→ `C:\Users\Henri\.julia\packages\CodecZlib\9jDi1\deps\build.log`
Building Arpack ──────────→ `C:\Users\Henri\.julia\packages\Arpack\cu5By\deps\build.log`
Building ZipFile ─────────→ `C:\Users\Henri\.julia\packages\ZipFile\YHTbb\deps\build.log`
Building CUDAnative ──────→ `C:\Users\Henri\.julia\packages\CUDAnative\PFgO3\deps\build.log`
Building Plots ───────────→ `C:\Users\Henri\.julia\packages\Plots\47Tik\deps\build.log`
Building CuArrays ────────→ `C:\Users\Henri\.julia\packages\CuArrays\qZCAt\deps\build.log`
Environment details
Details on Julia:
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i5-7300HQ CPU @ 2.50GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
JULIA_EDITOR = "C:\Users\Henri\AppData\Local\atom\app-1.37.0\atom.exe" -a
JULIA_NUM_THREADS = 2
Julia packages:
- CuArrays.jl
- Flux.jl
CUDA: toolkit version v10.1; driver version: unknown (I don't know where to find that information)