I hit what I think is a surprising performance-snag for custom array-wrapper types that subtype AbstractArray
.
The issue is best illustrated by example: suppose I define a thin wrapper over a vector, like so:
import Base: size, getindex, IndexStyle
# define a wrapper type over a vector
struct V{T} <: AbstractVector{T}
x::Vector{T}
end
size(v::V) = size(v.x)
Base.@propagate_inbounds getindex(v::V, i::Int) = v.x[i]
IndexStyle(::Type{<:V{T}}) where T = IndexStyle(Vector{T}) # ... IndexLinear()
Then I would had hoped that this wrapper would perform pretty much as well as the underlying vector when I iterate over it, i.e. I had hoped the performance of each of these functions would be identical:
# iterate over this wrapper directly
function f(v)
s = zero(eltype(v))
for vᵢ in v # <--
s += vᵢ
end
return s
end
# same thing, but iterate over the underlying vector instead
function g(v)
s = zero(eltype(v))
for xᵢ in v.x # <--
s += xᵢ
end
return s
end
This is true sometimes, e.g. for Float64
element types
using BenchmarkTools
v_float = V(rand(100000))
@btime f($v_float) # 117.999 μs
@btime g($v_float) # 117.999 μs --- everything performs the same; super!
… but, surprisingly, not for Int
element types:
v_int = V(rand(1:10, 100000))
@btime f($v_int) # slow: 40.499 μs
@btime g($v_int) # fast: 14.399 μs
… what is going on here?