What's the big deal? 0 vs 1 based indexing

As an engineer coming from a non-CS/EE background, I have no strong preference or use for 0-based indexing. In fact, 1-based indexing seems incredibly intuitive to me when I use Julia to work with data and manipulate arrays with loops, etc. For me, this aspect of Julia is a big plus. I find lots of hate on the web for Julia on this point, though. Why? Is it just a question of familiarity / code comparability, or are there underlying reasons I am missing?

Since moving away from 1-based indexing (FORTRAN), I have found that I need to worry about +1 or -1 much less often, so I don’t think the choice is entirely arbitrary. I use Python/NumPy, and it seems almost ideal. The only defect is:
array[start:-end]
There is an anomaly when end==0. [:-1] gives all but the last element, but [:-0] == [:0] is an empty array. This has occasionally bitten me (when end is a computed variable that just happens to come out as 0, so the slice bound becomes -0).
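
A minimal sketch of that anomaly and one possible workaround. It uses a plain Python list so it is self-contained; the same slicing rule applies to NumPy arrays. The helper name `drop_last` is hypothetical, and it assumes 0 <= n <= len(seq).

```python
def drop_last(seq, n):
    """Return seq without its last n items (assumes 0 <= n <= len(seq))."""
    # Use an explicit end position instead of a negative bound, so that
    # n == 0 is not read as "stop at position 0".
    return seq[:len(seq) - n]


data = [10, 20, 30, 40]

naive_one = data[:-1]   # all but the last element
naive_zero = data[:-0]  # -0 == 0, so this is data[:0], an empty list

safe_one = drop_last(data, 1)   # same as data[:-1]
safe_zero = drop_last(data, 0)  # the whole list, as intended
```

Computing the end position from the length keeps the "drop nothing" case consistent with the others, at the cost of a slightly longer expression.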

There is no big deal. [Julia has both; supporting 0-based or arbitrary indices makes code a little more complex if you want to handle every case. If you know you only want 1-based, just use 1:end; otherwise eachindex(my_array) is a little better.]

Programmers make a big deal of this. All CPUs have a linear address space that starts at 0 [in the past, and on microcontrollers, it can be more complex…].

In ordinary code you never want to access location 0… though yes, you can also define the start of arrays as 0.

This was mostly an issue for early (simple) compilers. The extra implied -1 you have to do can and will be optimized away. [Are there any exceptions to that? I think not. At least it would be done outside of loops, so it would not slow things down.]

1-based indexing is a big deal for the same reason most other fad topics are a big deal: it’s such a simple idea that everyone can have an opinion on it, and everyone seems to think they can “help” by telling their personal experience about how this arbitrary choice has affected them at one time in their life.

One interesting kernel, sometimes lost in the flamewars, is that the choice comes down to a preference for counting (1-based) versus offsets (0-based). Mathematically-focused languages understandably care most about counting, so the natural choice is 1-based (Fortran, Matlab, etc.). System languages dealing directly with memory tend to care most about offsets, so have usually been 0-based. Obviously there are counter-examples, but that’s the best explanation I’ve heard.

Also worth pointing out that many people have been working hard on abstractions for Julia that allow considerable flexibility in this area, see: http://julialang.org/blog/2016/03/arrays-iteration

That’s exactly it.
1-based indexing is actual indexing like in mathematics, while 0-based “indexing” isn’t indexing at all but pointer arithmetic. This comes from C where an array is just syntactic sugar for a pointer.

Instead of flaming, what would be best IMO is to simply have the flexibility to handle arbitrary bases, and also arbitrary ordering, which would help a lot interfacing to C/C++/Java/etc. that all have 0-based, row-major memory layouts.
Tim Holy has recently done some great stuff to add that flexibility, with OffsetArrays and PermutedDims, but I think it would be much nicer if that flexibility were integrated into the base Array types in Julia (as Fortran 90 and later have for arbitrary offsets).

This is a simple distinction but so easily overlooked, perhaps due to the terms used for these notions both in programming and in general language. I recently came across a post that explores this issue from a beginner’s point of view (i.e. someone who hasn’t considered this semantic distinction in mathematical terms, which was certainly my case). I think it explains the issue quite well: https://betterexplained.com/articles/learning-how-to-count-avoiding-the-fencepost-problem/. Wikipedia also describes the “fencepost problem” in https://en.wikipedia.org/wiki/Off-by-one_error#Fencepost_error.
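
To make the fencepost distinction concrete, here is a small sketch of the arithmetic. The function names are hypothetical, and `fence_posts` assumes the fence length is an exact multiple of the spacing.

```python
def fence_posts(length, spacing):
    """Posts for a straight fence: one more than the number of sections."""
    sections = length // spacing  # assumes length is a multiple of spacing
    return sections + 1


def inclusive_count(first, last):
    """Number of items when both endpoints are counted (counting view)."""
    return last - first + 1


def half_open_count(start, stop):
    """Number of items when `stop` is excluded (offset view)."""
    return stop - start


posts = fence_posts(100, 10)          # 10 sections need 11 posts
years = inclusive_count(1967, 2021)   # both 1967 and 2021 are counted
steps = half_open_count(0, 10)        # same length as range(0, 10)
```

The "+1" shows up exactly when both endpoints are included, which is where most off-by-one slips come from regardless of the index base.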

Exactly. I like it too.
Personally I think that 0-based indexing is one of the many mistakes of programming. Because everyone in the real world counts from one, 1-based is intuitive and saves your intellectual capacity for more important things than constantly doing (+1) in your mind. 0-based is more error-prone …

0-based indexing actually becomes very natural when you program for long enough and doesn’t use any intellectual capacity at all. In fact, when working with 1-based languages I have to constantly remind myself that they are 1-based and keep messing things up even when I’m actively thinking about it (talk about error-prone).

I think 0-based indexing makes quite a lot of sense in many scenarios, especially when dealing with algorithms and such (getting the root of an array-based binary tree works very easily by dividing by 2, but only on a 0-based array, spatial partitioning becomes easier as well, etc.), not to mention actually working with memory, where the index is really just an offset and adding a 1 to it would only serve to make things messy.
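
For reference, a sketch of the parent/child index arithmetic for an array-based binary tree under both conventions. It assumes the usual level-order (heap) layout, which may not be the layout the post has in mind, and the function names are hypothetical. In this particular layout, the plain division by 2 appears with 1-based indices, while the 0-based version needs a small adjustment.

```python
def parent_0(i):
    """Parent of node i when the root is stored at index 0."""
    return (i - 1) // 2


def children_0(i):
    """Left and right children of node i when the root is at index 0."""
    return 2 * i + 1, 2 * i + 2


def parent_1(i):
    """Parent of node i when the root is stored at index 1."""
    return i // 2


def children_1(i):
    """Left and right children of node i when the root is at index 1."""
    return 2 * i, 2 * i + 1
```

Both sets of formulas describe the same tree; shifting every index by one converts between them, so the choice mostly affects where the extra +1 or -1 ends up.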

It’s just a convention and there isn’t much difference either way. I agree with Dijkstra about 0-based indexing being (slightly) superior, but thankfully Julia allows indexing with an arbitrary base:

julia> using Polynomials

julia> p = Polynomial([3, 0, 2])
Polynomial(3 + 2*x^2)

julia> p[0]
3

I feel exactly the opposite. 0-based indexing is a huge mental burden to me.

I found C++ slightly easier to work in when using 0-based indexing, but then I was using n-dim arrays that I was hand indexing.

Pretty much every other programming language I’ve encountered and worked in as a mathematician (R, MATLAB, Octave, Fortran, Maple, and Maxima) is 1-based and has provided tools for working smoothly with n-dim arrays, and with that in place I find 1-based much more natural, with less to think about. The first item is 1, the 2nd 2, the n-th n… in my mind the 0th thing in a collection is when you have an empty set, so the thing doesn’t actually exist.

I guess if you’re counting counts then you might need the number of times 0 happened, but then you might want the number of times a thing happened in the years 1967-2021, then you’re out of luck either way (except that Julia handily gives you a solution to that).
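
The year-range case can be sketched as follows. This is only a rough Python illustration of the arbitrary-base idea, not how Julia implements it (in Julia, the OffsetArrays package mentioned earlier in the thread covers this); the class name `OffsetList` and the variable `events_per_year` are hypothetical.

```python
class OffsetList:
    """Hypothetical fixed-size list indexed from `first` to `last` inclusive."""

    def __init__(self, first, last, fill=0):
        self.first = first
        self.last = last
        self._data = [fill] * (last - first + 1)

    def _pos(self, index):
        # Translate the user-facing index into a 0-based storage position.
        if not self.first <= index <= self.last:
            raise IndexError(f"index {index} outside {self.first}..{self.last}")
        return index - self.first

    def __getitem__(self, index):
        return self._data[self._pos(index)]

    def __setitem__(self, index, value):
        self._data[self._pos(index)] = value

    def __len__(self):
        return len(self._data)


# Count events per year, indexing directly by the year itself.
events_per_year = OffsetList(1967, 2021)
events_per_year[1967] += 1
events_per_year[2021] += 3
```

The offset subtraction lives in one place, so code using the container never has to write `year - 1967` by hand.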

Hi folks. Thanks for the input, but this has been endlessly discussed and obviously isn’t going to change at this point, so there’s no purpose in reviving a five-year-old thread.
