(why) is there no Abstract Collection type

Hi everyone,
A common code pattern seems to be:

function thingthatiterates(x, collection::AbstractArray) 
  for i in collection
    ... 
  end
end

Now in some cases, it might be better to use something non-allocating instead of giving a vector, like a tuple or a range.
Ideally, I would like to have an abstract type, say abstractCollection, that has all these as it subtypes so that I could dispatch on it.
However, I do not seem to find anything like that.
I am aware that this can be somewhat solved by unions, but that seems less flexible.
Am I just unaware of such a type existing or is there a reason why it is not there?

The absence of such a type is easily verified:

julia> supertype(AbstractArray)
Any
1 Like

To use the function with something non-allocating, just omit the type annotation of the argument:

function thingthatiterates(x, collection) 
  for i in collection
    ... 
  end
end

The question perhaps is rather, what is the AbstractArray for?

You would define a method thingthatiterates(x, collection::AbstractArray) for example, if a special version of thingthatiterates should be available, that makes use of features of AbstractArrays that other collections don’t have, like getindex:

function getthirdelement(collection::AbstractArray)
    collection[3]
end
function getthirdelement(collection)
       for it in Iterators.take(collection, 3)
            y=it
       end
       y
end
1 Like

For all intents and purposes, AbstractArray is that type. The minimum required interface is just size (so we know how large the thing is) and getindex (so we know how to get things out).

The manual currently lists setindex! as a requirement too, but that has been removed, since Base itself has types that subtype AbstractArray that aren’t mutable, like ranges.

With hindsight, perhaps it would have been better to name that type AbstractCollection, but we can’t change that now and are stuck with it.


Notably, your for loop example is not a feature of collections, but of the protocol on iteration. The two are distinct interfaces, and since julia does not have multiple subtyping (there’s only one parent type), we can’t easily say “I want to index and I want to iterate”.

5 Likes

There are lots of collections even in Julia Base that aren’t AbstractArrays. Most notably, Tuples, NamedTuples, Dicts, Sets.

4 Likes

Yes, that’s part of the awkwardness of Abstractarray - there are so many optional methods that functionally split this type into multiple subtypes (without giving them a name) that it’s not really appropriate to truly make it an AbstractCollection instead.

For what it’s worth, even the mighty Rust doesn’t have such functionality[1] and it’s being potentially added with the generic associated types proposal.

The simple approach to dealing with this issue is duck typing, like Python and iterable, or Julia’s “just leave it untyped and throw something if you really care” approach. Duck typing is cool, but it scales poorly in the number of methods and types (as I understand). Julia’s approach is not ideal, but I’ve found that in cases where I want to type a function in such a way i.e. x::AbstractCollection, that it’s better to force the users to conform to your types so you can write more optimised code. I know it’s a huge cop out, but I’ve honestly not missed it.

Sukera’s reply addresses this well:


  1. Please be kind to me Rustaceans, I only started reading the book 2 days ago :'). ↩︎

Thanks for the replies

I agree that this is an obvious case where we would use AbstractArray. However, I think there are also others for which it would be perfectly valid to use an even more abstract supertype instead, without falling back to Any. For instance, there might be other methods which do not take a vector or a tuple at all which we would like to use for multiple dispatch.
There is the aspect of self-documentating code, personally I find it easy to just use autocompletion in the repl to show me what types the arguments of functions have to remember their correct order.

[quote=“jacobusmmsmit, post:7, topic:89233”] I’ve found that in cases where I want to type a function in such a way i.e. x::AbstractCollection, that it’s better to force the users to conform to your types so you can write more optimised code.
[/quote]
I disagree here. Julia’s strength is that it allows us to use flexible, generic functions, and type-specific optimization is done by the compiler.
Instead of sacrificing this I would rather use a union type in such a case.
Even if it’s not about making the life of users easier, there are definitely cases where it is better to provide the function with a mutible array and some where a tuple makes more sense. Sometimes you are only thinking of a few items in the collection so a tuple makes sense, but you never know, maybe later you need this function to be valid when the collection has a thousand entries

You might be right, collection is probably not the best word here. I think I do not quite understand the second part. What I am thinking of is not a type that has two parents but rather a supertype, in the sense of

abstract type AbstractIterable end
abstract type AbstractArray<: AbstractIterable end
abstract type Tuple<: AbstractIterable end #Or whatever is the most abstract tuple type

In principle this should be pretty much in the spirit of Julias design. I can see that at this stage it would probably not a good idea to add this to the language, though.

1 Like

See also Abstract iterator type? · Issue #23429 · JuliaLang/julia · GitHub

3 Likes

The trouble here is that maybe being iterable is not the defining feature of an AbstractArray and hence subtyping it is not really the truth. We can have infinitely sized arrays after all, which you could iterate, but probably don’t want to for sake of running out of physical time before iteration finishes. Since we don’t have multiple subtyping (i.e., I can’t say “subtype AbstractArray” and also “subtype AbstractIterable” for the same type - only one supertype is allowed), we also can’t delegate the subtyping issue to the subtypes of AbstractArray.

It’s not that this is a bad idea or anything - just that adding that is really, REALLY hard. There’s a reason things similar to this are literally the oldest open (and last single digit) issue on the repo.

1 Like

According to the manual, these are the methods a type should implement in order to be a collection:

  • isempty
  • empty!
  • length
  • checked_length

Note that iteration is not a requirement for a type to represent a collection.

As others have mentioned, things like collections and iterables are defined via interfaces, not via abstract types. What would be nice to have in the language is formal protocols/interfaces and/or a formal trait system.

2 Likes

That section is confusing. A Tuple, or more generally any immutable “collection”, is not a collection by this definition (No method for empty!). So these methods don’t seem to capture the concept of a general collection… maybe an immutable collection.

At the bottom of that section is a list of types under “Fully implemented by”. The list includes several immutable types. You can check that empty! throws a MethodError as expected. So maybe “fully implemented by” refers only to the last function checked_length? The list of types is not in the same box as checked_length, so the formatting needs to change if this is the case. But there is a method Base.checked_length(r) = length(r). So everything that implements length also implements checked_length. So it doesn’t make sense that “fully implmented…” refers only to checked_length. (well, you could opt out by explicitly throwing a MethodError, which is sometimes done.).

At a minimum this section should be clarified.
EDIT: Opened issue 47456

2 Likes