Problem with CUDAv3

Ok, sorry. I will try to be more concise.

I have two problems:

  1. Routines that were dispatching based on the type returned by threadIdx() are now broken. MWE:
import Pkg
Pkg.activate(".")
using CUDA

Pkg.status()

idx(i::Int64) = zero(i)*1.0

function krnl_foo!(ac)
    b, r = CUDA.threadIdx().x, CUDA.blockIdx().x

    ac[b,r] = idx(b)
    return nothing
end

A = randn(10,10)
ac = CuArray(A)

CUDA.@sync begin
  @device_code_warntype CUDA.@cuda threads=10 blocks=10 krnl_foo!(ac)
end

A = Array(ac)
A .== zero(A)

This works in CUDA v3.3.3, but fails in CUDA v3.5.0, apparently because threadIdx().x is no longer an Int64, so no method of idx() matches.

  2. If I correct this by changing the type expected by idx(), the code works, but @device_code_warntype does not report the correct types. MWE:

import Pkg
Pkg.activate(".")
using CUDA

Pkg.status()

idx(i::Int32) = zero(i)*1.0 # Change type for CUDAv3.5.0

function krnl_foo!(ac)
    b, r = CUDA.threadIdx().x, CUDA.blockIdx().x

    ac[b,r] = idx(b)
    return nothing
end

A = randn(10,10)
ac = CuArray(A)

CUDA.@sync begin
  @device_code_warntype CUDA.@cuda threads=10 blocks=10 krnl_foo!(ac)
end

A = Array(ac)
A .== zero(A)

This works in CUDA v3.5.0 (the final comparison is all true), but @device_code_warntype reports b and r as Union{}:

 Activating environment at `~/code/CUDAv3.5.0/Project.toml`
      Status `~/code/CUDAv3.5.0/Project.toml`
  [052768ef] CUDA v3.5.0
PTX CompilerJob of kernel krnl_foo!(CuDeviceMatrix{Float64, 1}) for sm_70

Variables
  #self#::Core.Const(krnl_foo!)
  ac::CuDeviceMatrix{Float64, 1}
  r::Union{}
  b::Union{}

Body::Union{}
1 ─ %1 = CUDA.threadIdx::Core.Const(CUDA.threadIdx)
│        (%1)()
│        Core.Const(:(Base.getproperty(%2, :x)))
│        Core.Const(:(CUDA.blockIdx))
│        Core.Const(:((%4)()))
│        Core.Const(:(Base.getproperty(%5, :x)))
│        Core.Const(:(b = %3))
│        Core.Const(:(r = %6))
│        Core.Const(:(Main.idx(b)))
│        Core.Const(:(Base.setindex!(ac, %9, b, r)))
└──      Core.Const(:(return Main.nothing))
10×10 BitMatrix:
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1

In my particular case, because of 1), a substantial part of my code is broken in CUDA v3.5.0.
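
For reference, one workaround I can imagine for 1) is to dispatch on the abstract Integer type instead of a concrete width. This is only a sketch, under the assumption that the index type changed from Int64 to Int32 between the two versions; the method below is a hypothetical variant of my idx(), and it only checks the dispatch on the CPU, not inside a kernel:

```julia
# Hypothetical variant of idx() that accepts any integer width, so it
# should not matter whether threadIdx().x is an Int32 or an Int64.
idx(i::Integer) = zero(i) * 1.0

# CPU-side check of the dispatch only; no GPU is needed for this part.
@assert idx(Int32(3)) === 0.0
@assert idx(Int64(3)) === 0.0
```

This would not help where the method body really depends on the concrete type, and it does not address the Union{} output in 2).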

Many thanks!