ERROR: InvalidIRError: compiling reduce_kernel with `eigen` command

I’m trying to convert my Julia code to CUDA programming, because I get an OutOfMemory() error when I run the original code on Julia v1.2. My converted code starts with the following lines:

using CuArrays
using Distances
using LinearAlgebra
using Distributions

data = Float32.(rand(10000, 15))
Eucldist = pairwise(Euclidean(), data, dims=1)
D = maximum(Eucldist.^2)
sigma2hat = mean(((Eucldist.^2) ./ D)[tril!(trues(size((Eucldist.^2) ./ D)), -1)])
L = exp.(-(Eucldist.^2 / D) / (2 * sigma2hat))
L = cu(L)
K = eigen(L)

With the last command, I get the following error:

┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with allowscalar(false)
└ @ GPUArrays C:\Users\User.julia\packages\GPUArrays\J4c3Q\src\indexing.jl:16
ERROR: InvalidIRError: compiling reduce_kernel(CuArrays.CuKernelState, typeof(==), typeof(&), Bool, CuDeviceArray{Float32,2,CUDAnative.AS.Global}, Val{256}, CuDeviceArray{Bool,1,CUDAnative.AS.Global}, Adjoint{Float32,CuDeviceArray{Float32,2,CUDAnative.AS.Global}}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to simple_broadcast_index)
Stacktrace:
[1] reduce_kernel at C:\Users\User.julia\packages\GPUArrays\J4c3Q\src\mapreduce.jl:141
Stacktrace:
[1] check_ir(::CUDAnative.CompilerJob, ::LLVM.Module) at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\compiler\validation.jl:114
[2] macro expansion at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\compiler\driver.jl:188 [inlined]
[3] macro expansion at C:\Users\User.julia\packages\TimerOutputs\7zSea\src\TimerOutput.jl:216 [inlined]
[4] #codegen#130(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.codegen), ::Symbol, ::CUDAnative.CompilerJob) at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\compiler\driver.jl:186
[5] #codegen at .\none:0 [inlined]
[6] #compile#129(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(CUDAnative.compile), ::Symbol, ::CUDAnative.CompilerJob) at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\compiler\driver.jl:47
[7] #compile#128 at .\none:0 [inlined]
[8] #compile at .\none:0 [inlined] (repeats 2 times)
[9] macro expansion at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\execution.jl:389 [inlined]
[10] #cufunction#170(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{,Tuple{}}}, ::typeof(cufunction), ::typeof(GPUArrays.reduce_kernel), ::Type{Tuple{CuArrays.CuKernelState,typeof(==),typeof(&),Bool,CuDeviceArray{Float32,2,CUDAnative.AS.Global},Val{256},CuDeviceArray{Bool,1,CUDAnative.AS.Global},Adjoint{Float32,CuDeviceArray{Float32,2,CUDAnative.AS.Global}}}}) at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\execution.jl:357
[11] cufunction(::Function, ::Type) at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\execution.jl:357
[12] macro expansion at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\execution.jl:174 [inlined]
[13] macro expansion at .\gcutils.jl:87 [inlined]
[14] macro expansion at C:\Users\User.julia\packages\CUDAnative\LkH1v\src\execution.jl:171 [inlined]
[15] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Bool,1}, ::Tuple{typeof(==),typeof(&),Bool,CuArray{Float32,2},Val{256},CuArray{Bool,1},Adjoint{Float32,CuArray{Float32,2}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at C:\Users\User.julia\packages\CuArrays\wXQp8\src\gpuarray_interface.jl:60
[16] gpu_call(::Function, ::CuArray{Bool,1}, ::Tuple{typeof(==),typeof(&),Bool,CuArray{Float32,2},Val{256},CuArray{Bool,1},Adjoint{Float32,CuArray{Float32,2}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at C:\Users\User.julia\packages\GPUArrays\J4c3Q\src\abstract_gpu_interface.jl:151
[17] acc_mapreduce(::Function, ::Function, ::Bool, ::CuArray{Float32,2}, ::Tuple{Adjoint{Float32,CuArray{Float32,2}}}) at C:\Users\User.julia\packages\GPUArrays\J4c3Q\src\mapreduce.jl:186
[18] ishermitian at C:\Users\User.julia\packages\GPUArrays\J4c3Q\src\mapreduce.jl:15 [inlined]
[19] issymmetric at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\LinearAlgebra\src\generic.jl:1009 [inlined]
[20] #eigen!#56(::Bool, ::Bool, ::typeof(LinearAlgebra.eigsortby), ::typeof(eigen!), ::CuArray{Float32,2}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\LinearAlgebra\src\eigen.jl:53
[21] #eigen! at .\none:0 [inlined]
[22] #eigen#58(::Bool, ::Bool, ::typeof(LinearAlgebra.eigsortby), ::typeof(eigen), ::CuArray{Float32,2}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\LinearAlgebra\src\eigen.jl:139
[23] eigen(::CuArray{Float32,2}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\LinearAlgebra\src\eigen.jl:137
[24] top-level scope at none:0

According to https://github.com/JuliaGPU/CuArrays.jl, “because CuArray is an AbstractArray, it doesn’t have much of a learning curve; just use your favourite array ops as usual.” So what am I doing wrong if I cannot perform a simple eigendecomposition…? And that’s just the beginning of my code…

CuArrays indeed aims to provide a relatively simple programming model for GPUs. We generally aim to have linear algebra and broadcast/map just work for users, but that doesn’t mean all packages just work. Examples are functions that are implemented as for loops over the array and therefore do scalar indexing into GPU memory (which is rather slow).
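As a workaround in the meantime, one option is to keep the eigendecomposition on the CPU and only move the factors to the GPU afterwards. A minimal sketch (the matrix size and `sigma2hat` value here are small placeholders, not the ones from the original snippet):

```julia
using LinearAlgebra

# Small symmetric stand-in for the kernel matrix L from the question.
Eucldist = rand(Float32, 100, 100)
Eucldist = (Eucldist + Eucldist') / 2
D = maximum(Eucldist.^2)
sigma2hat = 0.5f0                       # placeholder for the estimated value
L = exp.(-(Eucldist.^2 / D) / (2 * sigma2hat))

# Wrapping in Symmetric dispatches directly to the LAPACK symmetric solver
# and skips the issymmetric check that triggers the GPU kernel error.
F = eigen(Symmetric(L))

# If the rest of the pipeline runs on the GPU, upload the factors afterwards:
# using CuArrays
# vals = cu(F.values); vecs = cu(F.vectors)
```

This trades GPU speed for correctness on the decomposition itself, but for a 10000×10000 matrix the LAPACK symmetric solver is usually acceptable, and the downstream broadcasts can still run on the device.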

Can you post the full stacktrace? It sounds like something somewhere is calling `pointer` on a CuArray, and that is not possible, since most pieces of code that operate on a Ptr expect it to be a CPU address.

I started answering your post before you updated it.


The updated snippet indeed looks like a bug, probably in GPUArrays.jl.
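For reference, the trace suggests the failure can probably be reproduced without `eigen` at all: `eigen` first calls `issymmetric`/`ishermitian`, which boils down to an elementwise `==` of the array against its adjoint, reduced with `&` — the `reduce_kernel` over an `Adjoint`-wrapped `CuDeviceArray` in the trace. A sketch (untested; assumes the same CuArrays/GPUArrays versions as above, and requires a GPU):

```julia
using CuArrays, LinearAlgebra

A = cu(rand(Float32, 4, 4))
# issymmetric reduces A == A' on the device; the Adjoint wrapper around the
# device array appears to be what trips the dynamic dispatch in reduce_kernel.
issymmetric(A)
```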

I put the whole stacktrace in my post :slight_smile: From [1] to [24]

Yeah that was a response to version 1 of your post :slight_smile:

https://github.com/JuliaGPU/GPUArrays.jl/issues/201

Yes :slight_smile: I tried to put more lines of my code :slight_smile: