PSA: replacement of ind2sub/sub2ind in Julia 0.7+


#1

I’ve now seen a couple of cases where folks seem unhappy about the removal of the functions ind2sub and sub2ind. Some of the concerns may stem from the suggested replacement in the deprecation warning, as the replacement, while “safe,” seems considerably more awkward. This is a short post explaining their replacements and the motivation for the change; once you learn to think in the new way, I’d be surprised if you don’t become a fan of the new approach.

In older versions of Julia, the way to convert between a linear index and a cartesian index was to use the conversion functions ind2sub and sub2ind. In Julia 0.7 these were replaced by objects created by CartesianIndices and LinearIndices; for example,

function oldi2s(a, i)
    sz = size(a)
    ind2sub(sz, i)
end

should now be written

function newi2s(a, i)
    i2s = CartesianIndices(a)
    i2s[i]
end

There are two concerns I’ve heard. I’ll address them both.

  • That’s a lot of characters. Why the more verbose version? It’s only more verbose if you use i2s only once; if you do conversions in multiple places in the same function, the new syntax is considerably shorter. But more importantly, the old API has a poor design because it suffers from “must read the docs” syndrome: if you’re not using ind2sub and sub2ind every day, then I’ll wager that you have to remind yourself whether it’s ind2sub(sz, i) or ind2sub(i, sz)—it’s pretty much an arbitrary choice, and as a consequence infrequent users routinely have to check the docs every time they use them.
    The new API solves that uncertainty by splitting it into two steps: (a) creation of the conversion object (i2s above) depends only on the array, and (2) calculation of the index depends only on the index. Since each operation stands alone, there’s no order confusion.
    The final nice thing about the new API is that we already had CartesianIndices (it used to be called CartesianRange) for iterating over arrays; all we had to do was endow it with new properties, at which point it subsumed the purpose of ind2sub. There seems little reason to keep an entirely redundant function.
  • Isn’t it slow to create these objects? No, it’s not. Here’s a demo:
julia> function newi2s(a, i)
           i2s = CartesianIndices(a)
           i2s[i]
       end
newi2s (generic function with 1 method)

julia> a = rand(5, 7);

julia> using BenchmarkTools

julia> newi2s(a, 12)
CartesianIndex(2, 3)

julia> @btime newi2s($a, 12)
  6.379 ns (0 allocations: 0 bytes)
CartesianIndex(2, 3)

Compare 0.6:

julia> function oldi2s(a, i)
           sz = size(a)
           ind2sub(sz, i)
       end
oldi2s (generic function with 1 method)

julia> @btime oldi2s($a, 12)
  15.065 ns (0 allocations: 0 bytes)
(2, 3)

So Julia 1.0 still has no allocations, and is 2x faster than Julia-0.6. If you know the access is in-bounds you can make it even faster by changing the relevant line to @inbounds i2s[i].

In summary, the new API is memorable and efficient, and consequently recommended for general use.


Should we have a Pkg-Announcements channel?
#2

Thank you Tim for clarifying the issue.

I understand that the object i2s = CartesianIndices(sz) is lazy and does not allocate memory. So when we do i2s[i] we only allocate a single tuple, correct?


#3

That is right. And note that the tuple is NOT heap allocated, which is the “bad” kind of allocation, performance wise.


#4

This is going to make me sound like a moron, but honestly another thing I hated is that I could never remember what ind or sub was in ind2sub and I constantly got confused about which of the functions to use.

Now I’m absolutely certain that CartesianIndices is going to give me cartesian indices when I give linear indices as an argument.


#5

Tim, thank you. Not just for this post, but in general. Whenever I see a new package, blog post, or talk from you it’s just great and makes sense immediately. The Julia community can be lucky to have you!


#6

Slightly off topic: How would one now write the boxcar3 example in your blog post https://julialang.org/blog/2016/02/iteration? Over in slack it was suggested by Simon Schoelly to replace
for J in CartesianRange(max(I1, I-I1), min(Iend, I+I1)) with
for J in CartesianIndices(UnitRange.( max(I1, I-I1).I , min(Iend, I+I1).I )). Works but seems a bit ugly, maybe there is a better way?

(Also, maybe that blog post could/should be updated to 1.0 as it is linked from the docs?)


What is the proper way to construct CartesianIndices from CartesianIndex in Julia 1.0?
#7

+1 for this, I miss CartesianRange(CartesianIndex, CartensianIndex) a lot.


#8

Here is a way to avoid the second CartesianRange creation:

function boxcar3(A::AbstractArray)
    out = similar(A)
    R = CartesianIndices(A)
    I1, Iend = first(R), last(R)
    for I in R
        n, s = 0, zero(eltype(out))
        for J in R[UnitRange.(max(I1, I-I1).I,min(Iend, I+I1).I)...]
            s += A[J]
            n += 1
        end
        out[I] = s/n
    end
    out
end

It would be more natural if the call to UnitRange could just be replaced with a :, I guess.


#9

It may be worth demonstrating the other direction as well:

julia> function news2i(a,i...)
           s2i = LinearIndices(a)
           s2i[i...]
       end
news2i (generic function with 1 method)

julia> a = rand(5,7);

julia> news2i(a, 2,3)
12

Would it be possible / are there plans to make LinearIndices return a OrdinalRange when indexed with one? For example, we currently have

julia> news2i(a, :,1)
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

when it could be news2i(a, :, 1) -> 1:5 instead.


#10

Added in #29440 by Tim Holy!


#11

I struggled with this change recently and couldn’t figure out how to make it work in the next syntax for my use case.

Here is what the code looked like on v0.6:

data = randn(1000,100,10)
bigValues = data .> 2   # Some criteria
idx = find(bigValues)   # Get list matching criteria

(x, y, z) = ind2sub(size(data), idx)  # Convert to tuple of vector, one for each dimension

mask = ones(true, length(x))   
mask[x <. 20] = false                        # Apply some filter on individual dimensions
mask[y <. 2] = false                        # Apply some filter on individual dimensions

idx = idx[mask]  # Drop instances found on the first 19 indexes in the x dimension and on first index of y dimension

The migration tip given in the deprecation warnings only worked on a scalar ind and when I played around with variations, the best I managed was to get an array of tuples.

I finally reverted to creating the below function, which just feels like I needed to go much more primitive than what should be neccesary. (Thank goodness for fast for loops). Is there a better suggestion?

splitCartesian(mat :: Array, idx) = splitCartesian(size(mat), idx)
function splitCartesian(sz :: Tuple, idx)
   nD = length(sz)
   nidx = length(idx) 
   subs = zeros(Int, nidx, nD)
   cart = CartesianIndices(sz)[idx]
   for cnt=1:nidx
      for cnt2=1:nD
         subs[cnt, cnt2] = cart[cnt][cnt2] 
      end
   end
   T = ()
   for cnt2=1:nD
      T = (T..., @view(subs[:,cnt2]))
   end
   T
end

#12

I think you’ll be pleased to realize that in Julia 1.0 your entire script can be written like this:

data = randn(1000,100,10)
bigValues = data .> 2
idx = filter(ci->ci[1] >= 20 && ci[2] >= 2, findall(bigValues))

Don’t you think that’s much clearer than all the indexing shenanigans?


#13

Thanks, that does seem more compact for this part, but at some point still need to get the individual indexes. For example the indexes corresponds to a location in real-world coordinates for example (Lat, Long, elev), thus I would like the data as 3 vectors. I went with the approach of a struct of vectors as interface to later parts of the code to be able to easily call scatter plot etc… I guess I could use a comprehension

[idx[c][d] for c in 1:length(idx), d=1:3]

and use slices from that, or 3 separate comprehensions, although that creates copies of essentially the same information already in memory). I guess the point is that switching between tuple of vectors (old format) and vectors of tuples (new format) has knock on effects, unless there are other tricks I’m missing, I cannot directly do idx[:][2].


#14

This is the old Structure of Arrays (SoA) vs Array of Structure (AoS) issue. In addition to the solutions you’re describing, you can do

coords = reshape(reinterpret(Int, idx), length(first(idx)), length(idx))

which is like your comprehension solution but doesn’t make a copy of the data. It also shouldn’t be hard to create a function that “scatters” the elements of idx to a tuple-of-vectors. EDIT: in fact, here it is:

function scatterindices(idx::AbstractVector{CartesianIndex{N}}) where N
    out = ntuple(d->similar(idx, Int, axes(idx)), Val(N))
    for i in eachindex(idx)
        for d = 1:N
            out[d][i] = idx[i][d]
        end
    end
    return out
end

I cannot directly do idx[:][2]

You can do (i->i[2]).(idx).

But is your central concern that the deprecation warning doesn’t help with the vector case? Perhaps we’re missing a method?


#15

This might be very simple, but I’m editing my code that works with Julia 0.6 right now, so that I can use it with Julia 0.7+. I still don’t understand one thing about the CartesianIndices() function.

In this example:

If I wanted to assign the values of 2 and 3 from Cartesianindex(2,3) to two different variables, is there a simple way of doing that?


#16

x, y = Tuple(i)


#17

Great. Thanks!


#18

The blog post has now been updated for master/Julia 1.1.