Discussion on "Why I no longer recommend Julia" by Yuri Vishnevsky

martin.d.maas · May 17, 2022, 11:00pm

That’s fair. I meant that incorporating OffsetArrays as some sort of “standard feature” in the official documentation is probably a bad idea. What the LinearAlgebra package is doing is my preferred solution to ensure correctness and turn all these silent bugs into errors, rather than modifying every loop which starts at 1.

stillyslalom · May 17, 2022, 11:10pm

Not to get too off-track, but regarding your question about generic matrix iteration, you can use axes to get valid indices for a given array. To index into the diagonal, you could do something like this:

julia> v = OffsetArray(zeros(3, 4), OffsetArrays.Origin(0, 0));

julia> for (i, j) in zip(axes(v)...)
           v[i, j] += 1
       end

julia> v
3×4 OffsetArray(::Matrix{Float64}, 0:2, 0:3) with eltype Float64 with indices 0:2×0:3:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  1.0  0.0

ElOceanografo · May 17, 2022, 11:25pm

To get the diagonal indices, you can do:

julia> using LinearAlgebra

julia> diagind(randn(5,5))
1:6:25

Mason · May 17, 2022, 11:25pm

martin.d.maas:

To support OffsetArrays, I should do:
for I ∈ CartesianIndices(A)
    A[I[1],I[2]] = sin(x[I[1]]) * cos(y[I[2]])
end
Which is ok. But right now, I don’t know how to traverse the diagonal of a matrix in a way that supports OffsetArrays.

That sorta works, but is not the best way to do it. What I’d recommend instead is to use the axes function.

for i ∈ axes(A,1), j ∈ axes(A,2)
    A[i,j] = sin(x[i]) * cos(y[j])
end

and for the diagonal, it’s just

@assert axes(A, 1) == axes(A,2)
for i ∈ axes(A,1)
    A[i,i] = A[i,i] + 1
end

Zach_Christensen · May 18, 2022, 12:00am

API · ArrayInterface may be of interest

baptnz · May 18, 2022, 12:13am

Coincidentally, today’s interview of Leslie Lamport in Quanta has some interesting points regarding correctness, and computer tools to achieve it.

martin.d.maas · May 18, 2022, 12:40am

Thanks, axes is a nice function.

It’s funny, though, that axes(A,1) seems to be itself a 1-based array.

So, if I want to access all but the boundary points in a matrix built with OffsetArrays, I need to use something like A[axes(A,1)[2:end-1],axes(A,2)[2:end-1]], which would be simply A[2:end-1,2:end-1] if we were just sticking with 1-based indexing.

Clearly, supporting OffsetArrays comes at a cost. On the other hand, I see no benefits to it. But maybe we are getting a little off-topic and should perhaps continue this discussion of OffsetArrays on a separate thread.

Returning to the original subject, maybe the general issue here is not so much a “design” problem, but simply supporting too much abstraction in Julia base, like if we were trying to please everyone (i.e. “Julia is 1-based indexed by default, but you can use any indexing you like with AbstractArrays”). I would rather favor a more “opinionated” standard that just sticks to 1-based Arrays.

Benny · May 18, 2022, 12:43am

Except for a handful of mentions, the documentation actually goes out of its way to avoid saying OffsetArrays, even in a page clearly about it. Part of the reason is that the AbstractArray interface isn’t supposed to support OffsetArrays specifically, it’s just supposed to provide generic interface methods for indexing (for example, axes isn’t necessarily 1-based). The page also gives you an out with a couple methods that enforce 1-based indexing. However, this does mean there’s a discrepancy between the documented interface and the existing AbstractArray code in Base and packages like StatsBase that enforce 1-based indexing, and you are right that one way to resolve that is to redefine the interface as 1-based. But it seems that Julia is going in the other direction to facilitate some unnamed algorithms, just not as quickly as OffsetArrays users hoped.

Mason · May 18, 2022, 12:59am

No you don’t need to do that. You can simply write

A[begin+1:end-1, begin+1:end-1]

Just like how end lowers to lastindex(A, n), begin will lower to firstindex(A, n) inside an indexing expression.

JeffreySarnoff · May 18, 2022, 1:06am

On the composability of packages that have not met.

It is easier to accomplish for focused interoperability than once was the case.
It takes less time to do well than it would working in other languages I know.

It is hard to accomplish seamlessly
– it takes more code than one expects
– that requires more tests than one would prefer
– it requires that others care enough to raise issues

To do so is gratifying.

Elrod · May 18, 2022, 1:06am

For example, relevant to the axes discussion:

indices is useful. It’s like axes, except allows multiple arguments (like eachindex) so that it can check that the axes match.

indices((A,B),(2,1)) is like

ax = map(axes, (A,B), (2,1))
fax = first(ax)
all(==(fax), Base.tail(ax)) || throw()
return fax

In behavior, except it’ll also promote static size information.

Benny · May 18, 2022, 1:41am

Also worth bearing in mind that there are also situations when it’s perfectly fine to not use the generic interface methods, which can have a far more intimidating vocabulary (firstindex, axes, indices, eachindex vs LinearIndices vs CartesianIndices) than a more straightforward 1:size(A,1). It just so happens that the blog addresses a situation where code should be more generic to compose with something the documented interface nominally supports.

If you’re writing a function that takes in a file and writes to another file, you’re completely free to make any internally allocated and freed arrays entirely 1-based. If you are writing a script that doesn’t need to be very composable, you can simply document its 1-based indexing or any other limitations. A benefit of using more numbers and fewer methods is that it’s easier to share with colleagues who may not have the time to learn Julia enough to recognize the methods. Do keep the self-explanatory syntax and methods like size/begin/end.

suavesito · May 18, 2022, 5:35am

I have a quite complicated but completely real use case of OffsetArrays, at least 0-indexed arrays. The problem I face is a research project for something related to Clifford Algebras.

Basically the researcher that I work with though, along with a PhD student, about a representation of Spinors based on a binary representation with a base over {\mathbb{C}^2}^{\lfloor n/2 \rfloor} that works remarkably well when working with operators of Clifford Algebras.

Given the binary representation, the basis of spinors of an n dimensional space is represented by the integer numbers in the range [0, 2^{\lfloor {n/2}\rfloor}-1], and the coefficients are just complex. So a spinor ends up being just an array with indexes beginning at zero, filled with complex numbers. The beauty of the base they choose is that in the most interesting cases for the work of the researcher, most of the coefficients ends up being zero. So you really only need to use sparse arrays.

You could of course use 1-based index and do all the calculations of the basis putting manually the offsets every time you are going to index, but it is a lot easier (and readable) to just have the indexes start at 0. The researcher is not by any means a programmer, but I promise him that he would be able to write code that would make sense, be fast and easy to understand for him. So I need to use and work with 0 indexed arrays.

ImreSamu · May 18, 2022, 6:57am

in some fields: the Zero-based numbering is more optimal than the 1-based
and there are a lot of examples in the Wikipedia article:

Zero-based numbering - Wikipedia

the other problem is the fast and optimal Interoperability with the other 0-based languages.

Comparison of programming languages (array) - Wikipedia

so no easy answer.

pablosanjose · May 18, 2022, 10:50am

The general sentiment of the blog post resonates with me too. Julia permissiveness is the key to generic composability, but without well-defined mechanisms to establish contracts/interfaces, one can just hope for the best, and fear the worst when mix and matching packages that were not designed explicitly to work together. I do often feel some insecurity when practicing generic programming in Julia, because the potential complexity/behavior landscape that opens up is just too vast to keep in my head.

I (want to) think, however, that this Julia weakness is not fatal. Making the language more strict/correct can be done by imposing constraints and mechanisms for type contracts in future Julia versions, I would expect?

@ColinCaine mentioned above one way that, at least at first sight, looks quite interesting. A package such as AbstractArrayInterface.jl could perhaps be created that (1) documents the full interface of a Base abtract type and (2) exports a test_interface(::AbstractArray) that serves as an official and exhaustive test that any new implementation of AbstractArray should pass. This would go a long way towards making interoperability more predictable.

This doesn’t solve wrong uses of the very unsafe @inbounds in the wide ecosystem, which is a separate issue, I’d say. To fix that we might need compiler improvements that makes @inbounds unnecessary in most cases (as mentioned in @suavesito 's issue deprecate `@inbounds` and change it to `@unsafe_inbounds` · Issue #45342 · JuliaLang/julia · GitHub), with a deprecation into a no-op at some point.

[To be completely honest, the above is in fact me hoping that this (quite real) problem can be solved in the future. The progress made in the language has been astounding thus far. I have not been able to find any realistic alternative to Julia for numerical computing (interactive, fast, expressive, abstract, complete), so I do try hard to remain optimistic when discussing its weaknesses.]

Zach_Christensen · May 18, 2022, 11:14am

It would be great if we could do something like IsInboundsIndices(context, indices) where context could somehow reference a collection without a performance hit for storing that extra info or looking it up. However, until we can develop something like that I see no issue with the existence of @inbounds. All tools can be dangerous/inappropriate when used incorrectly.

gottacatchenall · May 18, 2022, 1:30pm

Agree w/ previous responses. They “correct” implementation would be using eachindex(a) instead of 1:length(a), which won’t break if using OffsetArrays.jl

martin.d.maas · May 18, 2022, 8:04pm

Benny:

Except for a handful of mentions, the documentation actually goes out of its way to avoid saying OffsetArrays, even in a page clearly about it . Part of the reason is that the AbstractArray interface isn’t supposed to support OffsetArrays specifically, it’s just supposed to provide generic interface methods for indexing (for example, axes isn’t necessarily 1-based). The page also gives you an out with a couple methods that enforce 1-based indexing. However, this does mean there’s a discrepancy between the documented interface and the existing AbstractArray code in Base and packages like StatsBase that enforce 1-based indexing, and you are right that one way to resolve that is to redefine the interface as 1-based. But it seems that Julia is going in the other direction to facilitate some unnamed algorithms, just not as quickly as OffsetArrays users hoped.

Yes, it appears that Julia is moving in that direction – 1-based indexing is just the default, but the plans are to offer broad support for generic arrays. The problem I is that this transition seems only half-way done.

Going back to the documentation, where I see there is a lot of work to be done, that page you link is a great resource. However, I think it’s pretty hard to find, located in:

Developer’s documentation > Documentation of Julia’s internals > Arrays with custom indices

To avoid problems like this in the future, the best practice regarding generic arrays should be probably summarized and presented much earlier to users. This should include a handful of functions that suffice to solve the most common indexing tasks. Thanks to what has been discussed here, I believe eachindex, axes, begin, and end pretty much do the job, and that there could be an occasional need for firstindex and lastindex. Then, there are more advanced indexing techniques that could be documented separately.

The other big deal to consider in this discussion is how we do integration testing. On one hand, given the generic and composable nature of Julia, code coverage is a largely irrelevant metric. It would be interesting to think about new tools, which could perhaps apply macros to a test suite, to test the degree of “genericness” of a package, if that’s possible somehow. Or a tool that could modify existing tests to replace whichever specific type was tested, with the intended one?

fgerick · May 18, 2022, 9:18pm

I think this point highlights how Julia managed to push generic code (or code reusability) to a new level that other languages haven’t managed to and therefore this problem never arised. With great power comes great responsibility.

xfcjscn · May 19, 2022, 3:06am

Regarding to array, I think the issue is that it’s not prominent enough in the lengthy document.
Most developer perceive array as: A[0, 1, 2, …].
Newbee julia developer perceive array as: A[1, 2, 3, …].
Actual julia abstract array is: A[m … n] where m, n in Z.

Obviously there is a gap, and the gap is not obvious enough to new julia developers.