Type instability with CuVector inside struct

this is a very simple problem, so i fear i'm making a dumb mistake. apologies if so. but i can't figure out why the function fun below is type unstable when called with a struct that has a CuVector field, whereas with a regular Vector field it is fine. thanks.

julia> using CUDA

julia> struct Foo
           x::Vector{Float64}
       end

julia> foo = Foo(ones(3))
Foo([1.0, 1.0, 1.0])

julia> struct CuFoo
           x::CuVector{Float64}
       end

julia> cu_foo = CuFoo(ones(3))
CuFoo([1.0, 1.0, 1.0])

julia> function fun(s)
           s.x .= 0.0
       end
fun (generic function with 1 method)

julia> @code_warntype fun(foo)   # everything here is type stable
MethodInstance for fun(::Foo)
  from fun(s) in Main at REPL[6]:1
Arguments
  #self#::Core.Const(fun)
  s::Foo
Body::Vector{Float64}
1 ─ %1 = Base.dotgetproperty(s, :x)::Vector{Float64}
│   %2 = Base.broadcasted(Base.identity, 0.0)::Core.Const(Base.Broadcast.Broadcasted(identity, (0.0,)))
│   %3 = Base.materialize!(%1, %2)::Vector{Float64}
└──      return %3

julia> @code_warntype fun(cu_foo)   # this is NOT type stable :(
MethodInstance for fun(::CuFoo)
  from fun(s) in Main at REPL[6]:1
Arguments
  #self#::Core.Const(fun)
  s::CuFoo
Body::CuArray{Float64, 1}   ### RED  RED  RED
1 ─ %1 = Base.dotgetproperty(s, :x)::CuArray{Float64, 1}   ### RED  RED  RED
│   %2 = Base.broadcasted(Base.identity, 0.0)::Core.Const(Base.Broadcast.Broadcasted(identity, (0.0,)))
│   %3 = Base.materialize!(%1, %2)::CuArray{Float64, 1}   ### RED  RED  RED
└──      return %3

julia> using MethodAnalysis

julia> methodinstances(fun)
2-element Vector{Core.MethodInstance}:
 MethodInstance for fun(::Foo)
 MethodInstance for fun(::CuFoo)

this is with julia 1.8.5 and CUDA.jl v4.1.4.

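CuVector has additional type variables, so CuVector{Float64} on its own is not a concrete type (note the inferred CuArray{Float64, 1} above, with the last parameter left open). That makes the field abstractly typed, which is what @code_warntype flags. Rather than spelling out the full type, it's better to parameterize the struct on the field type so that you're not tied to the exact definition.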
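To make the concreteness point checkable without a GPU, here is a minimal sketch. FakeCuArray, FakeCuVector and LooseFoo are hypothetical stand-ins that only copy the parameter layout of CuArray{T, N, B}; they are not part of CUDA.jl. With CUDA loaded, the same isconcretetype query should apply to CuVector{Float64}.

```julia
# Hypothetical stand-in with the same parameter layout as CuArray{T, N, B}.
# It is not a real array type; it only illustrates concreteness.
struct FakeCuArray{T, N, B} end

# Alias that fixes N = 1 but leaves B free, mirroring how CuVector{T} behaves.
const FakeCuVector{T} = FakeCuArray{T, 1}

struct LooseFoo
    x::FakeCuVector{Float64}   # B is unspecified, so this field type is abstract
end

cpu_concrete   = isconcretetype(Vector{Float64})                   # all parameters fixed
alias_concrete = isconcretetype(FakeCuVector{Float64})             # B still free
full_concrete  = isconcretetype(FakeCuArray{Float64, 1, Nothing})  # all parameters fixed
field_concrete = isconcretetype(fieldtype(LooseFoo, :x))           # what the compiler sees

println((cpu_concrete, alias_concrete, full_concrete, field_concrete))
```

A field whose declared type is not concrete forces the compiler to treat every access to it as dynamically typed, regardless of what is stored there at runtime.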

oh right! thanks so much.

as a note to my future self, the extra parameter specifies the buffer (storage) type:

julia> CUDA.ones(3)
3-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
 1.0
 1.0
 1.0

help?> CuArray
search: CuArray CuArrayPtr AnyCuArray DenseCuArray StridedCuArray CuDeviceArray CuTextureArray

  No documentation found.

  Summary
  ≡≡≡≡≡≡≡≡≡

  mutable struct CuArray{T, N, B}

  Fields
  ≡≡≡≡≡≡≡≡

  storage :: Union{Nothing, CUDA.ArrayStorage{B}}
  maxsize :: Int64
  offset  :: Int64
  dims    :: Tuple{Vararg{Int64, N}}

  Supertype Hierarchy
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

  CuArray{T, N, B} <: GPUArraysCore.AbstractGPUArray{T, N} <: DenseArray{T, N} <: AbstractArray{T, N} <: Any

so my above script could be re-written as:

julia> using CUDA

julia> struct Foo{T}
           x::T
       end

julia> foo = Foo(ones(3))
Foo{Vector{Float64}}([1.0, 1.0, 1.0])

julia> cu_foo = Foo(CUDA.ones(3))
Foo{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}(Float32[1.0, 1.0, 1.0])

julia> function fun(s)
           s.x .= 0.0
       end
fun (generic function with 1 method)

julia> @code_warntype fun(foo)
MethodInstance for fun(::Foo{Vector{Float64}})
  from fun(s) in Main at REPL[5]:1
Arguments
  #self#::Core.Const(fun)
  s::Foo{Vector{Float64}}
Body::Vector{Float64}
1 ─ %1 = Base.dotgetproperty(s, :x)::Vector{Float64}
│   %2 = Base.broadcasted(Base.identity, 0.0)::Core.Const(Base.Broadcast.Broadcasted(identity, (0.0,)))
│   %3 = Base.materialize!(%1, %2)::Vector{Float64}
└──      return %3

julia> @code_warntype fun(cu_foo)
MethodInstance for fun(::Foo{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
  from fun(s) in Main at REPL[5]:1
Arguments
  #self#::Core.Const(fun)
  s::Foo{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}
Body::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}
1 ─ %1 = Base.dotgetproperty(s, :x)::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}
│   %2 = Base.broadcasted(Base.identity, 0.0)::Core.Const(Base.Broadcast.Broadcasted(identity, (0.0,)))
│   %3 = Base.materialize!(%1, %2)::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}
└──      return %3
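As an alternative to reading @code_warntype output by eye, the same comparison can be scripted with Test.@inferred. This is a CPU-only sketch: LooseFoo, TightFoo and zero_out! are illustrative names, and an AbstractVector{Float64} field stands in for the abstractly typed CuVector{Float64} field.

```julia
using Test

struct LooseFoo
    x::AbstractVector{Float64}   # abstract field type (stand-in for CuVector{Float64})
end

struct TightFoo{T}
    x::T                         # concrete once T is known
end

zero_out!(s) = (s.x .= 0.0)

# Passes silently: the return type is inferred as Vector{Float64}.
@inferred zero_out!(TightFoo(ones(3)))

# @inferred throws when the inferred return type differs from the actual one.
loose_is_stable = try
    @inferred zero_out!(LooseFoo(ones(3)))
    true
catch
    false
end

println("loose struct inferred cleanly: ", loose_is_stable)
```

A check like this can go in a test suite so that a later change to the struct definition does not silently reintroduce the instability.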

in my particular use case, i wanted to supply the struct with defaults using Base.@kwdef. i'm not sure how to avoid naming CUDA.Mem.DeviceBuffer explicitly, but this works for me:

julia> N=3
3

julia> Base.@kwdef struct Bar{T}
           x::T = Vector(undef, N)
       end

julia> bar = Bar{Vector{Float64}}()
Bar{Vector{Float64}}([0.0, 1.63e-322, 0.0])

julia> cu_bar = Bar{CuVector{Float64, CUDA.Mem.DeviceBuffer}}()
Bar{CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}([-8.826581242931566e212, 4.94e-322, 0.0])
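One possible way to avoid writing the buffer type by hand is to pass the array as a keyword argument and let the struct parameter be inferred from the value. The sketch below is CPU-only, and Baz is a hypothetical name. The assumption (untested here) is that with CUDA loaded, Baz(x = CuVector{Float64}(undef, 3)) would pick up the full CuArray type, buffer parameter included, in the same way.

```julia
# Hypothetical struct; the default builds an uninitialized array of type T.
Base.@kwdef struct Baz{T}
    x::T = T(undef, 3)   # T is only known when written explicitly, as in Baz{T}()
end

# Explicit parameter: the default value is constructed from T.
baz_default = Baz{Vector{Float64}}()

# No explicit parameter: T is inferred from the keyword value.
baz_inferred = Baz(x = zeros(Float32, 3))

println(typeof(baz_default))
println(typeof(baz_inferred))
```

The design choice is that the caller either names the full type once or hands over an already constructed array. Calling Baz() with neither a type parameter nor a value would fail, because T is not defined for the default in that case.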