Select specific elements from each column of a matrix in julia

Let’s say I have this 16x1000 matrix:

test_mat = rand(16, 1000)

And I have these indices:

test_indices = rand(1:16, 4, 1000)

I want to extract from test_mat a 4x1000 matrix where the indices of the row are represented by each column of test_indices (for example, the first column should be test_mat[test_indices[:, 1], 1] ).

I can do this with a for loop easily but is there a more clever way using some more intelligent indexing? I cannot figure it out.

I won’t call it pretty, but you could use linear indexing

# you can overwrite test_indices with lin_indices if you want to save an allocation
lin_indices = test_indices .+ (0:size(test_indices,2)-1)' .* size(test_mat,1)
test_mat[lin_indices]

Arguably more elegant is CartesianIndex

cart_indices = CartesianIndex.(test_indices, axes(test_indices,2)')
test_mat[cart_indices]

But I would probably go with a comprehension like

[test_mat[test_indices[i,j],j] for i in axes(test_indices,1), j in axes(test_indices,2)]

or a simple for loop that fills an array.

4 Likes

amazing, thank you! why do you think the for loop is the best option? Is it more readable to you?

Any of the above are fine (as is a for loop). There may be performance differences among them, and some are easier to read than others. But unless you find that the performance difference of this operation is significant (it probably isn’t, in the grand scheme of your computation) I would go for whatever seems most readable to you.

The nice thing about for loops is that they have a comparatively low potential for surprise performance pitfalls and are sometimes easier to understand than more elaborate constructs.

In this case, I would probably use the comprehension. But if you don’t need test_indices for anything else, you might think about computing them on a per-column basis and indexing test_mat then and there, rather than making a big matrix for the indices and then separately indexing the matrix.

1 Like

Another way is to slice the matrix and the indices, and stack the result. Not quite as efficient as the indexing way (although I’m not quite sure why) but quite a literal translation of your description:

julia> res1 = [test_mat[test_indices[i,j],j] for i in axes(test_indices,1), j in axes(test_indices,2)]; # as above

julia> res2 = stack(eachcol(test_mat), eachcol(test_indices)) do col, ind
         view(col, ind)
       end;

julia> res1 == res2
true