Type stable accumulator over heterogeneous collection

question

#1

I am working on a library where I cannot avoid heterogeneous collections, and I want type stable code. The MWE below is a typical (but heavily simplified) example of something I would do (transformations, accumulation), but I find it hard to get the compiler to infer the type.

function next_ix(acc, x::AbstractArray)
    l = length(x)
    acc + l, acc + (1:l)
end

next_ix(acc, x::Number) = acc+1, acc+1

function indexes(xs)
    acc = 0
    map(x -> (((acc, ix) = next_ix(acc, x)); ix), xs)
end

@inferred indexes((1,1,1))           # error, infers Tuple{Any,Any,Any}
@inferred indexes((1,1:2,ones(2,2))) # ditto

The question is twofold: (1) how should I deal with the specific problem above, (2) how to do this in general, so that I get maintainable and easy to read code.

Is it even sensible to try to use basic constructs in Base for this? With @cstjean’s excellent Unrolled.jl, things become very easy:

using Unrolled

@unroll function indexes2(xs)
    acc = 0
    result = ()
    @unroll for x in xs
        acc, ix = next_ix(acc, x)
        result = (result..., ix)
    end
    result
end

@inferred indexes2((1,1,1))
@inferred indexes2((1,1:2,ones(2,2)))

so should I just keep using that everywhere?


#2

Try https://github.com/tkoolen/TypeSortedCollections.jl.
It is fairly new, that’s why I think some feedback is welcome, maybe even a feature request.


#3

Depends on what you’re doing, but the ArrayPartition in RecursiveArrayTools.jl works great for what we do in DiffEq

It’s essentially a wrapper over a tuple of arrays so that broadcasted operations are type-stable. Then indexing in the tuple directly is of course type stable. The general indexing over the whole collection is only type stable if all of the element types are the same of course, but most things we do can just use the special broadcast and get away without having to do full indexing.


#4

Interesting, I wasn’t aware of RecursiveArrayTools before I started writing TypeSortedCollections. Some differences I observed after a quick glance:

  • TypeSortedCollections doesn’t implement broadcast (although I suppose it could). It doesn’t implement the various operators implemented for ArrayPartition either. It does implement map!, foreach, and mapreduce, which aren’t specialized for ArrayPartition.
  • I consciously chose not to implement the full AbstractArray interface (getindex et al.) because that’s necessarily not type stable (except maybe this multi-index version of getindex?).
  • A TypeSortedCollection stores not just a Tuple of Vectors containing the data, but also a tuple of index vectors that map back to the order of the type-heterogeneous input collection from which the TypeSortedCollection was constructed. This allows interactions with plain AbstractVectors that are aligned with the original input data to be indexed correctly) (see example at end of section).