Can I make indexing an AbstractVector{Float64} return a Float64?

Hey all!

I’m trying to create an iterator over an array right now that only returns certain elements. The definition of the type looks something like this (all code will be at the bottom):

struct DwellIterAbstract
    data::AbstractVector{Float64}
    dlast::Int
    x_lo::Float64
    x_hi::Float64
end

I have a method which I apply to the iterator like so (it’s a little verbose because it makes it easier to read @code_warntype output):

function is_in_range(dwi::DwellIterAbstract, i)
    elem = dwi.data[i]
    b1 = dwi.x_lo <= elem
    b2 = elem <= dwi.x_hi
    return b1 && b2
end

Intuitively, I expect that dwi.data[i] should be a Float64, because dwi.data is an AbstractVector{Float64}. When I run @code_warntype on this method, however, I get the following:

@code_warntype is_in_range(dwi_abs, 10)
MethodInstance for is_in_range(::DwellIterAbstract, ::Int64)
  from is_in_range(dwi::DwellIterAbstract, i::Integer) in Main at /mnt/wdblack-4T/Experiments/10-2022/splitting-on-ring/code/splitting.jl:18
Arguments
  #self#::Core.Const(is_in_range)
  dwi::DwellIterAbstract
  i::Int64
Locals
  b2::Any
  b1::Any
  elem::Any
Body::Any
1 ─ %1 = Base.getproperty(dwi, :data)::AbstractVector{Float64}
│        (elem = Base.getindex(%1, i))
│   %3 = Base.getproperty(dwi, :x_lo)::Float64
│        (b1 = %3 <= elem)
│   %5 = elem::Any
│   %6 = Base.getproperty(dwi, :x_hi)::Float64
│        (b2 = %5 <= %6)
└──      goto #3 if not b1
2 ─      return b2
3 ─      return false

Notably, all of elem, b1, and b2 are of type Any. This method gets called a lot in my code, and the allocation alone is sinking my performance (I assume because these Any values have to be boxed). This seems strange to me because my intuition says that indexing an AbstractVector{Float64} should definitely return a Float64, so I’m not sure why it’s giving me Any.

If I modify this struct so that data is of type Vector{Float64}, then the three locals are typed as a Float64 and two Bools, but then I lose the ability to use my code on views, or SharedArrays.

Question: Have I misunderstood what the AbstractArray interface promises? Is there any way I can get dwi.data[i] to result in a Float64?

Full code:

struct DwellIterAbstract
    data::AbstractVector{Float64}
    dlast::Int
    x_lo::Float64
    x_hi::Float64
end

struct DwellIterConcrete
    data::Vector{Float64}
    dlast::Int
    x_lo::Float64
    x_hi::Float64
end

function is_in_range(dwi::DwellIterAbstract, i::Integer)
    elem = dwi.data[i]
    b1 = dwi.x_lo <= elem
    b2 = elem <= dwi.x_hi
    return b1 && b2
end

function is_in_range(dwi::DwellIterConcrete, i)
    elem = dwi.data[i]
    b1 = dwi.x_lo <= elem
    b2 = elem <= dwi.x_hi
    return b1 && b2
end

In REPL:

include(<file.jl>)
dwi_abs = DwellIterAbstract(zeros(10_000), 10_000, -1.0, 1.0)
dwi_con = DwellIterConcrete(zeros(10_000), 10_000, -1.0, 1.0)
@code_warntype is_in_range(dwi_abs, 10)
@code_warntype is_in_range(dwi_con, 10)

Not exactly. The misunderstanding is in assuming that a semantic guarantee of the AbstractArray interface translates into correct type inference in your code. Julia cannot “see” this guarantee in a way that lets it always infer the element type (and anybody can implement a subtype that breaks the contract). If you want to guarantee good inference while keeping the abstract field, you need to annotate every use of dwi.data[i] with Float64, i.e., replace dwi.data[i] with (dwi.data[i]::Float64).
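As a minimal sketch of the type-assertion approach, the snippet below reuses the DwellIterAbstract struct from the question and adds a hypothetical function is_in_range_asserted (the name is made up for illustration). The assertion tells the compiler that elem is a Float64, so everything computed from it can be inferred.

```julia
# Same struct as in the question: the field type is abstract.
struct DwellIterAbstract
    data::AbstractVector{Float64}
    dlast::Int
    x_lo::Float64
    x_hi::Float64
end

# Hypothetical variant of is_in_range with a type assertion on the element.
function is_in_range_asserted(dwi::DwellIterAbstract, i::Integer)
    # Throws a TypeError at runtime if getindex returns anything but a Float64.
    elem = dwi.data[i]::Float64
    return dwi.x_lo <= elem <= dwi.x_hi
end

dwi_abs = DwellIterAbstract(zeros(10_000), 10_000, -1.0, 1.0)
is_in_range_asserted(dwi_abs, 10)

# Inspect what the compiler infers for the return type.
Base.return_types(is_in_range_asserted, (DwellIterAbstract, Int))
```

Note that the getindex call itself is still dispatched at runtime, because the concrete type of dwi.data is unknown when the method is compiled. The assertion only repairs inference for the code that follows it.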

In performance-critical code, fields should not be given abstract types (see Performance Tips). Instead, parameterize your struct so that the compiler can track the concrete type of each field:

struct DwellIterParam{T, I, R}
    data::T
    dlast::I
    x_lo::R
    x_hi::R

    function DwellIterParam(data::AbstractVector{R}, dlast::Integer, x_lo::R, x_hi::R) where {R<:Real}
        new{typeof(data),typeof(dlast),R}(data, dlast, x_lo, x_hi)
    end
end

Now the compiler knows the concrete type of every field:

julia> DwellIterParam([1,2,3], 10, 1, 2)
DwellIterParam{Vector{Int64}, Int64, Int64}([1, 2, 3], 10, 1, 2)

julia> DwellIterParam([1.0,2.0,3.0], 10, 1.0, 2.0)
DwellIterParam{Vector{Float64}, Int64, Float64}([1.0, 2.0, 3.0], 10, 1.0, 2.0)
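To connect this back to the concern about views and SharedArrays, here is a small sketch of a parametric struct whose parameter is constrained to AbstractVector{Float64}. The name DwellIterTyped is hypothetical and the layout is a simplified assumption rather than the definition above. Each instance stores its concrete container type, so the same method works for a Vector and for a view.

```julia
# Hypothetical struct: V is the concrete container type, restricted to
# vectors with Float64 elements.
struct DwellIterTyped{V<:AbstractVector{Float64}}
    data::V
    dlast::Int
    x_lo::Float64
    x_hi::Float64
end

function is_in_range(dwi::DwellIterTyped, i::Integer)
    elem = dwi.data[i]
    return dwi.x_lo <= elem <= dwi.x_hi
end

vec_iter  = DwellIterTyped(zeros(100), 100, -1.0, 1.0)
view_iter = DwellIterTyped(view(zeros(100), 1:50), 50, -1.0, 1.0)

# The data field has a concrete type in both cases.
isconcretetype(fieldtype(typeof(vec_iter), :data))
isconcretetype(fieldtype(typeof(view_iter), :data))

is_in_range(vec_iter, 10)
is_in_range(view_iter, 10)
```

Running @code_warntype is_in_range(view_iter, 10) is a quick way to check that elem is no longer typed as Any.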

The point is that the compiler can’t know what is going to happen given ONLY that the field is a <:AbstractVector{T}:

julia> struct A{T} <: AbstractVector{T}
           data::T
       end

julia> a = A(3.0);

julia> eltype(a)
Float64

julia> Base.getindex(::A, x...) = 1

julia> a[123]
1

This is not enforced in any way, because Julia doesn’t have formal interfaces or contracts.


Sorry, I don’t follow:

  • You define a type A which is a subtype of AbstractVector, i.e., when you construct A(3.0) the compiler obviously has access to its concrete type (namely A):

     julia> typeof(a)
     A{Float64}  # even the eltype is known via the type parameter
    
  • In my example, the type of the whole vector is a type parameter

    julia> param = DwellIterParam([1.0,2.0,3.0], 10, 1.0, 2.0)
    DwellIterParam{Vector{Float64}, Int64, Float64}([1.0, 2.0, 3.0], 10, 1.0, 2.0)
    

    and accordingly, the compiler can track the concrete type of each field in param.

Remember that a function can only ever be called on values of concrete types, i.e., abstract types only steer dispatch, and no value ever has an abstract type:

julia> g(x::AbstractVector) = sum(x)
g (generic function with 1 method)

julia> @code_warntype g([1,2,3])
MethodInstance for g(::Vector{Int64})
  from g(x::AbstractVector) in Main at REPL[25]:1
Arguments
  #self#::Core.Const(g)
  x::Vector{Int64}
Body::Int64
1 ─ %1 = Main.sum(x)::Int64
└──      return %1


julia> @code_warntype g([1.0,2.0,3.0])
MethodInstance for g(::Vector{Float64})
  from g(x::AbstractVector) in Main at REPL[25]:1
Arguments
  #self#::Core.Const(g)
  x::Vector{Float64}
Body::Float64
1 ─ %1 = Main.sum(x)::Float64
└──      return %1

This is also the reason why the function barrier technique works.
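A rough sketch of a function barrier applied to the struct from the question: the abstract field is read once in an outer function and passed to a kernel, which Julia compiles for the concrete type it receives. The names count_in_range and count_in_range_kernel are hypothetical helpers, not part of the original code.

```julia
# Same struct as in the question: the field type is abstract.
struct DwellIterAbstract
    data::AbstractVector{Float64}
    dlast::Int
    x_lo::Float64
    x_hi::Float64
end

# Kernel (hypothetical): specialized on the concrete type of `data`,
# so the loop body is fully inferred.
function count_in_range_kernel(data::AbstractVector{Float64}, x_lo::Float64, x_hi::Float64)
    n = 0
    for x in data
        n += (x_lo <= x <= x_hi)  # adding a Bool to an Int counts the hits
    end
    return n
end

# Outer function (hypothetical): a single dynamic dispatch happens here,
# instead of one per element.
count_in_range(dwi::DwellIterAbstract) =
    count_in_range_kernel(dwi.data, dwi.x_lo, dwi.x_hi)

dwi = DwellIterAbstract([-2.0, 0.0, 0.5, 3.0], 4, -1.0, 1.0)
count_in_range(dwi)  # 0.0 and 0.5 lie in [-1, 1], so this returns 2
```

This keeps the abstract field but only pays off when the per-element work can be moved into the kernel. For a method that is called once per element, such as is_in_range, a parametric struct is usually the simpler fix.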

Yes, but when the field is data::AbstractVector{T}, the compiler only sees that declared type, and the point is: “given data::AbstractVector{Float64}, getindex() can return anything”.

Here, A is an example of getindex on an AbstractVector{Float64} returning an Int64.


OK, now I get it … another good reason why abstract types for fields should be avoided.