Indexing with `CartesianIndex`

Consider this code:

julia> 1[]
1

julia> 1[CartesianIndex()]
1

julia> Ref(1)[]
1

julia> Ref(1)[CartesianIndex()]
ERROR: MethodError: no method matching getindex(::Base.RefValue{Int64}, ::CartesianIndex{0})

Is there a reason why CartesianIndex() indexing of 0-dimensional containers is sometimes allowed and sometimes not?

PS. As usual this is DataFrames.jl related (see https://github.com/JuliaData/DataFrames.jl/pull/1890/files#diff-57cbbe702db7443003fc8c590f239d70R173 where I initially wanted to use CartesianIndex() but it failed in some cases).

CartesianIndex indexing only works by default for arrays. We’ve explicitly enabled it for some other array-like objects, like Numbers and getindex(t::Tuple, i::CartesianIndex{1}). Ref is probably another good candidate.

The reason it’s sometimes allowed and sometimes not is simply that not all indexable collections are multidimensional or array-like at all, so we can’t do it by default for everyone. For example, it doesn’t make sense to do anything special with CartesianIndex for dictionaries. Thus we need to decide on a case-by-case basis if the non-array thing should act array-like… and how much so.

Tuples are a good example. They support some array-like things — including one-dimensional cartesian indices and non-scalar slicing — but they don’t support t[] or t[CartesianIndex()] or t[:,:]. All of those things would work with t = [1] but don’t with t = (1,). Deciding exactly where that line should land can be a tough call.
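
To make the contrast concrete, here is a small sketch comparing a one-element vector with a one-element tuple. The helper `works` is hypothetical (it is not part of Base) and only reports whether a call throws, so the tuple results reflect whichever Julia version runs the code rather than a fixed guarantee.

```julia
# Hypothetical helper: true if `f()` returns normally, false if it throws.
function works(f)
    try
        f()
        return true
    catch
        return false
    end
end

v = [1]     # one-element Vector
t = (1,)    # one-element Tuple

# Array-like indexing that the vector supports:
vec_empty     = v[]                  # zero indices into a one-element array
vec_cartesian = v[CartesianIndex()]  # zero-dimensional CartesianIndex
vec_matrix    = v[:, :]              # trailing colon adds a singleton dimension

# Array-like indexing that tuples have opted into:
tup_cartesian = t[CartesianIndex(1)] # one-dimensional CartesianIndex
tup_slice     = t[1:1]               # non-scalar slicing returns a Tuple

# Probe the forms the post lists as unsupported for tuples.
for (label, f) in ["t[]"                 => () -> t[],
                   "t[CartesianIndex()]" => () -> t[CartesianIndex()],
                   "t[:, :]"             => () -> t[:, :]]
    println(rpad(label, 20), " works: ", works(f))
end
```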

I was just chatting to @andyferris about this very thing yesterday. He mentioned in passing that he wanted multidimensional dictionary-like objects to work. He had some interesting motivation related to tensor networks but I admit to not following it as well as I should.

I get your point. My idea was rather of a contract like:

If something has axes then it should be indexable using CartesianIndex. Ref is an example where this contract currently breaks.

I don’t think having a method (axes(::X)) imply that some other method exists or is applicable (getindex(::X, ::CartesianIndex)) is a usual convention in Julia.

Interfaces are usually defined with abstract types or traits.

My point is exactly that axes is part of the AbstractArray interface. If you define axes but do not define CartesianIndex indexing then you are not fully implementing the contract of the interface.

The contract for axes in Base is:

Return the tuple of valid indices for array A.

so it should not be defined for an object that is not an array (or pretends to be an array by fulfilling the contract).

Of course you are allowed to do this in Julia, but I would think that in Base we should be consistent.

Now why this is important: broadcasting uses CartesianIndex by default, so an object can be accepted for broadcasting (because it supports axes) but then fail later if it does not support CartesianIndex indexing. This is exactly the case for Ref currently.
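
A minimal sketch of the situation described above: Ref reports zero-dimensional axes and is accepted by ordinary broadcasting. Whether `Ref` also accepts `CartesianIndex()` depends on the Julia version, so the last part probes for it instead of assuming an answer.

```julia
r = Ref([1, 2, 3])

# Ref advertises itself as a zero-dimensional container...
ref_axes  = axes(r)    # empty tuple
ref_ndims = ndims(r)   # 0

# ...so broadcasting accepts it and treats the wrapped vector as one value.
hits = in.([1, 5], r)

# Probe, rather than assume, whether this Julia version defines
# getindex(::Ref, ::CartesianIndex{0}).
ref_supports_cartesian = try
    r[CartesianIndex()]
    true
catch err
    err isa MethodError || rethrow()
    false
end
println("Ref supports CartesianIndex(): ", ref_supports_cartesian)
```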

My point was that defining a method associated with an interface does not imply that a particular interface applies — it has to be made explicit by either traits or abstract types.

Eg think about size — having it defined does not mean that the abstract array interface applies (Base has examples of this for types which implement iteration, but not the array interface).
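
One illustration of that point, as a sketch: a generator over a range has `size` defined, yet it is not an AbstractArray. The choice of `Base.Generator` as the example type is mine, not taken from the post.

```julia
g = (x^2 for x in 1:3)   # a Base.Generator: iterable, but not an AbstractArray

gen_size     = size(g)               # shape is taken from the wrapped range
gen_is_array = g isa AbstractArray   # false
gen_elements = collect(g)            # iteration is what the type implements

# `applicable` reports whether any getindex method accepts these arguments.
println("size: ", gen_size, ", indexable with an Int: ", applicable(getindex, g, 1))
```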

However, I think that in the specific case Ref could implement the method you suggested.

I agree with you, it is just a matter of consistency in Base.

Just for reference, here is my code in DataFrames.jl and the issue it raises. bcf in it is a Broadcasted{AbstractArrayStyle{0}} object:

v = bcf.f(getindex.(bcf.args)...)

works but

v = bcf.f(getindex.(bcf.args, Ref(CartesianIndex()))...)

does not work.

One would assume that if something is in the bcf.args collection (and passed the earlier tests), it is array-like and these two snippets are equivalent.
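
The two forms can be compared on a hand-built zero-dimensional broadcast. This is only a sketch: the `bcf` below is constructed directly with `Base.Broadcast.broadcasted` and is not the DataFrames.jl code path. Its arguments are a Number and a 0-dimensional Array, both of which accept `CartesianIndex()`, so the two forms agree here. A `Ref` argument is the case where the second form may fail.

```julia
using Base.Broadcast: broadcasted, Broadcasted, AbstractArrayStyle

# A lazy zero-dimensional broadcast over a Number and a 0-d Array.
bcf = broadcasted(+, 1, fill(2))
is_zero_dim = bcf isa Broadcasted{<:AbstractArrayStyle{0}}

# Form 1: index every argument with no indices.
v1 = bcf.f(getindex.(bcf.args)...)

# Form 2: index every argument with a zero-dimensional CartesianIndex.
# Ref(...) keeps the index from being broadcast over.
v2 = bcf.f(getindex.(bcf.args, Ref(CartesianIndex()))...)

# With a Ref argument, form 1 is safe because Ref supports `r[]`.
bcf_ref = broadcasted(+, 1, Ref(2))
v3 = bcf_ref.f(getindex.(bcf_ref.args)...)
```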

Yes, I think the question is one of consistency: why define axes if you never support multidimensional indexing? What are the semantics of the result? What can I do with x where x = axes(y)? Is this meant to be a broadcast thing?

I think it’s disingenuous to say that axes itself is not an interface. Sure, it’s a small part of the larger AbstractArray interface but it should provide guarantees of its own (it returns a Tuple, for example, the length of the tuple determines the dimensionality of the container which probably shouldn’t disagree with ndims, etc, etc). @Tamas_Papp in a duck-typed world we don’t always want to rely on abstract types and we haven’t yet fleshed out a set of traits to define most of the interfaces in Base (AFAICT we can’t even describe the majority of variances of behavior of the AbstractArrays provided by Base and the standard libs).

Anyway, as @c42f alluded to, yes I think there might be value to gain by thinking about how multidimensional indexing fits into containers beyond AbstractArray, what traits we use to describe the capabilities of containers in general, etc.

Just to clarify: I think that in this particular case, Ref should support the whole AbstractArray interface, even if technically it is not a subtype.

In general, however, I don’t think that relying on the presence of various methods is a good strategy to determine if something follows an interface, because the set of methods for interfaces can overlap (eg size).

And yes, this implies that we need to be more careful about defining and documenting interfaces (eg allowing queries with traits). The last big change in this direction was #11794, which was a great improvement, but it does not have to be the last one. A lot of implicit interfaces in Julia are underdocumented, which makes it difficult to rely on and extend them.

Ref should support the whole AbstractArray interface

This seems somewhat reasonable (but I note we are reluctant with other fundamental types like Tuple).

Going from the opposite direction… I’ve been curious about whether we could make the binary representation of Array{T,0} the same as Ref{T}. (Array{T,0} already seems like a good place to store and manipulate a single value, but the flexible higher-dimensional array representation in C is a little “heavier” than seems strictly necessary in zero-D.)

I would prefer a separate wrapper type, eg Scalar{T}. IMO reusing Ref for this purpose is just confusing.

Sorry, I didn’t understand. Can you expand?

I think that Ref was designed for a purpose that is unrelated to broadcasting, and its use for broadcasting is not ideal (basically, a pun).

I would prefer the introduction of a wrapper type, eg

struct Scalar{T}
    value::T
end

that would take its place. Whether to introduce surface syntax for it (eg &value, as proposed in #27608) is an orthogonal issue. Ideally, users who don’t use it for its original purpose would never ever encounter Ref.

As for 0-dimensional arrays, I think it is a cute corner case and we should support them for consistency, but would prefer to have a wrapper type that signals intent instead of using them for wrapping “scalars”.
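
As a rough sketch of what such a wrapper might need in order to act as a zero-dimensional container in broadcasting: the `Scalar` struct is from the post, but the method set below is an assumption for illustration, not a proposal from the thread or an existing Base API.

```julia
# Wrapper type from the post; every method below is an assumed, minimal set.
struct Scalar{T}
    value::T
end

# Zero-dimensional shape.
Base.axes(::Scalar) = ()
Base.size(::Scalar) = ()
Base.length(::Scalar) = 1
Base.ndims(::Scalar) = 0
Base.ndims(::Type{<:Scalar}) = 0
Base.eltype(::Type{Scalar{T}}) where {T} = T

# Both zero-index forms return the wrapped value.
Base.getindex(s::Scalar) = s.value
Base.getindex(s::Scalar, ::CartesianIndex{0}) = s.value

# Opt in to broadcasting as a zero-dimensional container.
Base.Broadcast.broadcastable(s::Scalar) = s
Base.Broadcast.BroadcastStyle(::Type{<:Scalar}) = Base.Broadcast.DefaultArrayStyle{0}()

s = Scalar([1, 2, 3])
scalar_plain     = s[]
scalar_cartesian = s[CartesianIndex()]
scalar_hits      = in.([1, 5], s)   # the wrapped vector is treated as one value
```

Because `Scalar` defines both `axes` and `CartesianIndex{0}` indexing, it satisfies the contract discussed earlier in the thread without touching `Ref`.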

It wasn’t orthogonal at all in terms of the design decision. It was a proposed horse trade that never happened — we wanted the &x syntax to work equally well for both ccall and broadcast, and the best way to make sure both of those things could use that syntax was to use Ref for both.
