Iteration/getindex performance of AbstractArray wrapper-types

tchr · January 21, 2021, 2:59pm

I hit what I think is a surprising performance-snag for custom array-wrapper types that subtype AbstractArray.

The issue is best illustrated by example: suppose I define a thin wrapper over a vector, like so:

import Base: size, getindex, IndexStyle

# define a wrapper type over a vector
struct V{T} <: AbstractVector{T}
    x::Vector{T}
end
size(v::V) = size(v.x)
Base.@propagate_inbounds getindex(v::V, i::Int) = v.x[i]
IndexStyle(::Type{<:V{T}}) where T = IndexStyle(Vector{T}) # ... IndexLinear()

Then I would had hoped that this wrapper would perform pretty much as well as the underlying vector when I iterate over it, i.e. I had hoped the performance of each of these functions would be identical:

# iterate over this wrapper directly
function f(v)
    s = zero(eltype(v))
    for vᵢ in v   # <--
        s += vᵢ
    end
    return s
end
# same thing, but iterate over the underlying vector instead
function g(v)
    s = zero(eltype(v))
    for xᵢ in v.x # <--
        s += xᵢ
    end
    return s
end

This is true sometimes, e.g. for Float64 element types

using BenchmarkTools
v_float = V(rand(100000))

@btime f($v_float) # 117.999 μs
@btime g($v_float) # 117.999 μs --- everything performs the same; super!

… but, surprisingly, not for Int element types:

v_int   = V(rand(1:10, 100000))

@btime f($v_int) # slow:    40.499 μs
@btime g($v_int) # fast:    14.399 μs

… what is going on here?

tchr · January 21, 2021, 3:35pm

@Sukera pointed out on Slack that defining

iterate(v::V, i=1) = iterate(v.x, i)

removes the performance penalty for the Vector{Int} wrapper relative to the underlying array. Needing to do this to get “full” performance is not specifically mentioned in the AbstractArray Interfaces docs; maybe it should be?

mbauman · January 21, 2021, 7:07pm

I’d rather fix the core issue here than document a workaround. It is surely something to do with inlining and/or bounds check since the difference is the SIMD-ifying of the loop or not. Ah, sure enough:

https://github.com/JuliaLang/julia/issues/39354

Topic		Replies	Views
Why the performance hit on this trivial vector wrapper broadcast? General Usage	3	562	June 17, 2019
Sum and other functions on custom subtype of AbstractArray 1000 slower than on normal Arrays General Usage performance	4	520	June 5, 2018
Custom Array Performance Issues General Usage	8	444	September 30, 2019
How to wrap a vector so that it does simd? Performance	0	320	January 8, 2022
Declare vector type or not / views / subarrays Performance	5	774	June 24, 2020

Iteration/getindex performance of AbstractArray wrapper-types

Related topics