How to implement efficient custom array

stemann · March 22, 2019, 10:07am

How to implement a custom StridedArray or DenseArray?

I have some C-code which returns a pointer to an array, which I can wrap using unsafe_wrap. I need to build a custom array-implementation, but operations on this custom array is a lot slower than on the array (wrapped or not):

abstract type AbstractFooArray{T,N} <: DenseArray{T,N}
end

mutable struct FooArray{T,N} <: AbstractFooArray{T,N}
    array::DenseArray{T,N}
end

Base.IndexStyle(::Type{<:AbstractFooArray{T,N}}) where {T,N} = IndexLinear()
Base.getindex(a::AbstractFooArray{T,N}, i::Int) where {T,N} = a.array[i]
Base.setindex!(a::AbstractFooArray{T,N}, v, i::Int) where {T,N} = a.array[i] = v

Base.size(a::AbstractFooArray{T,N}) where {T,N} = size(a.array)

#Base.strides(a::AbstractFooArray{T,N}) where {T,N} = strides(a.array)
#Base.unsafe_convert(::Type{Ptr{T}}, a::AbstractFooArray{T,N}) where {T,N} = unsafe_convert(Ptr{T}, a.array)
#Base.has_fast_linear_indexing(a::AbstractFooArray{T,N}) where {T,N} = true

a = zeros(UInt8, (3, 2448, 2048));
b = unsafe_wrap(Array, pointer(a), size(a));

a_f = FooArray(a)
b_f = FooArray(b)

@time sum(a)
@time sum(b)

  0.002121 seconds (4 allocations: 160 bytes)
  0.002241 seconds (4 allocations: 160 bytes)

@time sum(a_f)
@time sum(b_f)

  0.562584 seconds (16 allocations: 400 bytes)
  0.546441 seconds (16 allocations: 400 bytes)

The reported numbers are after compilation warm-up (second run).

mauro3 · March 22, 2019, 10:11am

There might be other things but a biggy is that DenseArray is an abstract type and thus having a type field with that type is slow. See performance tips in the docs.

stemann · March 22, 2019, 11:59am

Thanks - that helped a lot - brought the gap to a factor ~ 5:

  0.002038 seconds (4 allocations: 160 bytes)
  0.002234 seconds (4 allocations: 160 bytes)

  0.010350 seconds (4 allocations: 160 bytes)
  0.009492 seconds (4 allocations: 160 bytes)

StefanKarpinski · March 22, 2019, 12:19pm

Maybe you don’t need a custom array type? You can create a Julia array that is backed by C-allocated memory. Unless you need different behavior a new type is unnecessary.

stemann · March 23, 2019, 7:41am

True, but in this case I would like to add reference counting with a custom dispose function to enable re-use of large arrays (returning it to a pool of arrays from which it can be re-used):
https://github.com/IHPSystems/ResourcePools.jl/tree/feature/pooled_resource/src

Topic		Replies	Views
Help with reducing allocations when indexing into custom Array Performance array , memory-allocation , speed-optimization	10	159	January 14, 2025
Custom Array Performance Issues General Usage	8	444	September 30, 2019
Iteration/getindex performance of AbstractArray wrapper-types Performance	2	461	January 21, 2021
How to Sensibly Implement Generic Slice Indexing for Custom Array Type General Usage question , gpu , performance , design	3	334	January 29, 2024
Type instability with SubArray on custom Array type Performance	2	471	June 20, 2018

How to implement efficient custom array

Related topics