Is it ever logical to have arrays without indexing? (diag(A) seems to be such a case, a logical conclusion of generic programming)

It seems like an absurd question, but it is the logical conclusion I arrived at (for Vectors) through, yes, a contrived usage of OffsetArrays, explained here: Add an abstract type signaling e.g. OffsetArray safe to use (also add badges to packages) · Issue #284 · JuliaArrays/OffsetArrays.jl · GitHub

It’s not unheard of to disallow indexing for changing an array (see StaticArrays; immutability is often a good thing), but to also disallow indexing for looking up a value in an array seems a bit extreme. Indexing seems like the most basic property you want an array/vector to have.

I might be wrong about my “logical conclusion” (in math, or if Julia were different, but it seems to hold for the status quo), but let’s first pretend I’m right.

That the diagonal of an array is neither always logically 1-based nor always 0-based doesn’t mean it doesn’t exist; it only makes it problematic how you index it.

What do you want to do with the diagonal? The trace comes to mind:

tr(A) = sum(diag(A))

You can define it without referring to any specific index of diag(A).
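
For example, a minimal sketch (M here is just a hypothetical example matrix):

using LinearAlgebra

mytr(A) = sum(diag(A))   # no specific index of diag(A) is ever named

M = [1.0 2.0; 3.0 4.0]
mytr(M) == tr(M)         # true; both give 5.0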

But as soon as you try to actually add things up, you want something like:

# a sketch of a sum written with explicit indexing (renamed to avoid shadowing Base.sum)
mysum(A) = begin s = zero(eltype(A)); for i in eachindex(A); s += A[i]; end; s end

At this lower level you DO want to index into the array. I suppose the internals of diag(A) could define a 1-based Vector (which is what it does now), and sum would have access to it, by piracy, but otherwise you would have no (public) access to A[i]. It does, however, seem to me well-defined to index into the first and last elements, and everything in between, with A[begin+1] etc., just not by a specific Int index (if only this didn’t call getindex).
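
For instance, a small sketch of such offset-agnostic lookups (they still go through getindex, as noted):

using LinearAlgebra

v = diag([1.0 2.0; 3.0 4.0])   # currently a plain 1-based Vector
v[begin]                        # first element, without writing a literal 1
v[begin + 1]                    # second element
v[end]                          # last element
first(v), last(v)               # indexing-free alternatives for the endpoints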

I’m not sure what you can usefully do with the diagonal of a matrix, besides the trace. And note, Diagonal(A) is (still) always definable, just not diag(A), which gives a Vector.

The trend in math is to explore more generality, and it’s not unusual to lose some properties along the way; e.g. complex numbers are more general than the reals but can no longer be ordered.

I hesitate to put this into the “Internals & Design” category, but thought people might still be intrigued.

Note, while in linear algebra, and math in general, 1-based is the norm, 0-based indexing is sometimes used, e.g. for matrices. I’m not sure what currently happens in Julia if you try to multiply matrices of those two types that are otherwise identical. In math I guess it is well-defined (the indices are just arbitrary labels); it’s just unstated whether the end result should be 1-based or 0-based. I even thought of unifying regular Julia arrays and OffsetArrays to allow BOTH 1-based and 0-based indexing. It’s not totally outlandish and could work (there’s a precedent in e.g. PHP, which allows two kinds of indexing for arrays, 0-based and by text key). It might solve my diag(A) dilemma.
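
For reference, a small sketch of how axes-generic code already handles both conventions, assuming OffsetArrays is available:

using OffsetArrays

v1 = [10, 20, 30]                     # regular 1-based Vector
v0 = OffsetArray([10, 20, 30], 0:2)   # the same data, 0-based

first_elem(v) = v[firstindex(v)]      # written against the generic indexing API
first_elem(v1) == first_elem(v0)      # true; both return 10
axes(v1), axes(v0)                    # (Base.OneTo(3),) vs. a 0:2-style axis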

An array without indexing seems like an “iterator” to me.

(The diagonal of a matrix may not be a great example, though, because the entries of the diagonal are regularly spaced in memory and so one can support random access. In fact, we do support an array-like view of the diagonal: @view A[diagind(A)] works just fine via LinearAlgebra.diagind.)
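
A small sketch of that view in use (M is a hypothetical example matrix):

using LinearAlgebra

M = [1.0 2.0; 3.0 4.0]
dv = @view M[diagind(M)]   # an indexable, array-like view of the diagonal, no copy
dv[1], dv[end]             # 1.0, 4.0; random access works
sum(dv)                    # 5.0, same as tr(M)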

Imperative style uses more indexing and loops, while functional style often uses more folds. You can write sum as reduce(+, xs, init=0).
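
For instance, folds work on plain iterators that support no indexing at all:

xs = (x^2 for x in 1:5)   # a generator: iterable but not indexable
reduce(+, xs, init=0)     # 55; getindex is never called
sum(xs)                   # same result via the fold underneath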

Right, but again, some arrays would need indexing at some lower level (folds are also appropriate, say, for linked lists, so I’m not saying arrays/direct indexing are always needed).

In the case I had, diag(A) will give you a Vector, so it would surprise many to not always be able to index directly into it, as usual.

The point was really, as with diag, wouldn’t it be surprising to not have the same full API you (at least I) would expect for any array-like object that you know is there underneath?

However, for a diagonal Vector, I’m not sure what the actually helpful operations are. Besides multiplying it with something, the only thing that comes to mind is the trace. I can’t think of anything where I would index some particular element directly, as opposed to indirectly, as with those two operations.

Strictly speaking, while I mentioned the trace, I see it’s only defined for matrices, i.e. tr(M) or tr(Diagonal(M)), but not:

julia> tr(diag(M))
ERROR: MethodError: no method matching tr(::Array{Float64,1})

It doesn’t change the question that much, except it’s maybe not too bad that diag(M) can’t always be defined.
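
For reference, what does work (assuming M = [1.0 2.0; 3.0 4.0], consistent with the outputs further down):

using LinearAlgebra

M = [1.0 2.0; 3.0 4.0]
tr(M)             # 5.0; tr is defined for matrices
sum(diag(M))      # 5.0; diag(M) is an ordinary, summable Vector
tr(Diagonal(M))   # 5.0; Diagonal(M) is still a matrix, so tr applies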

When would diag be used? E.g. for something like this:

julia> M*diag(M)
2-element Array{Float64,1}:
  9.0
 19.0

notably different from the result for the same M:

julia> M*Diagonal(M)
2×2 Array{Float64,2}:
 1.0   8.0
 3.0  16.0

I didn’t check; maybe sum (and tr) actually only need the iterator interface.
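
A quick way to check for sum: a hypothetical wrapper that implements only the iteration protocol (no getindex of its own) is already enough (tr, on the other hand, still requires a matrix):

# a diagonal "iterator" that deliberately defines no getindex
struct DiagIter{T,M<:AbstractMatrix{T}}
    A::M
end
Base.iterate(d::DiagIter, i=1) =
    i > minimum(size(d.A)) ? nothing : (d.A[i, i], i + 1)
Base.length(d::DiagIter) = minimum(size(d.A))
Base.eltype(::Type{DiagIter{T,M}}) where {T,M} = T

M = [1.0 2.0; 3.0 4.0]
sum(DiagIter(M))   # 5.0; only iterate/length/eltype were needed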

Check out Dex. It’s early days, but the point is to hide indexing in type-safe type enumeration and have any remaining indexing throw errors at compile time (it doesn’t take the normal dependent-typing approach, and it allows controlled mutation):

By exploiting the connection between arrays and memoized typed functions, you can have type safety but still generate all possible values of an array on the fly. So with the right type, you can avoid all or some index arithmetic that would normally be required, and it works on the GPU with the right loop effect type.

It has array shape information at compile time (even with complex shapes), without the complexities of full dependent typing or massive compilation cost. These array types or “index sets” are expressive enough to represent named dims, arrays of arrays and more, with fully type-safe indexing and safe/predictable parallel loops.

The dynamic-programming Levenshtein-distance example exploits this abstraction but still ends up being very close to C++ in runtime (without future compiler optimizations). It is totally type-safe and even enforces at compile time that the output table is 1 larger in each dimension than the inputs.

Compare to a Julia implementation, which has more branching and error-prone index arithmetic.
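
For concreteness, a minimal sketch of such a Julia version (not the specific implementation referenced), showing the +1 offsets and the table that is 1 larger in each dimension:

function levenshtein(a::AbstractString, b::AbstractString)
    m, n = length(a), length(b)
    d = zeros(Int, m + 1, n + 1)   # table is 1 larger in each dimension
    d[:, 1] .= 0:m
    d[1, :] .= 0:n
    for (j, cb) in enumerate(b), (i, ca) in enumerate(a)
        cost = ca == cb ? 0 : 1
        d[i+1, j+1] = min(d[i, j+1] + 1,   # deletion
                          d[i+1, j] + 1,   # insertion
                          d[i, j] + cost)  # substitution
    end
    return d[m+1, n+1]
end

levenshtein("kitten", "sitting")   # 3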

It would be interesting to see how we might be able to match the concision/expressiveness with a https://github.com/JuliaFolds/FGenerators.jl version…but we’ll never get the same type safety unless we could somehow lift it into the type domain and use JET.jl. I’m also not sure about SIMD, loop unrolling, parallelism, GPU and the other optimizations that the index-sets approach provides as well.

As of [#876] tables themselves can be used as index sets, letting us build index sets whose dimension is determined at runtime, but still tracked by the compiler. For instance, the type (Fin D)=>letter represents the set of all D-letter strings. And the table for i:((Fin D)=>letter). i instantiates all those strings. For another example, ((Fin D)=>(Fin s))=>Float is a table of Floats with s^D elements.