Multi column indexing

jpmorr · August 12, 2021, 12:29pm

I’m new to Julia and there are some basics that I’m clearly not understanding. Take the following example, which has confused me all morning.

storage_indx_mult = hcat([12, 165, 55, 66, 89, 101, 2], [68, 135, 409, 222, 6, 818, 7])
column_index_mult = [1,1,2,2,1,1,2]
id_index = [6,4,5,2,1,7,3]
test_ids_mult = storage_indx_mult[id_index, column_index_mult]

In this case I’m expecting to get back a single vector:

test_ids_mult_expected = [101, 66, 6, 135, 12, 2, 409]

But Julia gives back a full matrix:

test_ids_mult = storage_indx_mult[id_index, column_index_mult]
7×7 Matrix{Int64}:
 101  101  818  818  101  101  818
  66   66  222  222   66   66  222
  89   89    6    6   89   89    6
 165  165  135  135  165  165  135
  12   12   68   68   12   12   68
   2    2    7    7    2    2    7
  55   55  409  409   55   55  409

As I said, I’m clearly not understanding something about Julia and indexing. I think coming from a Python background is confusing my thinking. How do I extract the single vector I want from the two index vectors?

rafael.guerra · August 12, 2021, 12:47pm

You can try this:

getindex.(Ref(storage_indx_mult), id_index, column_index_mult)
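For context, here is a minimal sketch of what that one-liner does, using the data from the question. The dot broadcasts getindex over the two index vectors element by element, and Ref(...) wraps the matrix so that broadcasting treats it as a single value instead of iterating over its entries. The name paired is only illustrative.

```julia
storage_indx_mult = hcat([12, 165, 55, 66, 89, 101, 2], [68, 135, 409, 222, 6, 818, 7])
column_index_mult = [1, 1, 2, 2, 1, 1, 2]
id_index = [6, 4, 5, 2, 1, 7, 3]

# Equivalent to calling getindex(storage_indx_mult, i, j) once per (i, j) pair,
# where i comes from id_index and j from column_index_mult at the same position.
paired = getindex.(Ref(storage_indx_mult), id_index, column_index_mult)

@assert paired == [101, 66, 6, 135, 12, 2, 409]
```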

jpmorr · August 12, 2021, 1:19pm

Thanks. I would never have figured this out. It seems rather unintuitive compared to what I’m used to in Python.

jipolanco · August 12, 2021, 1:28pm

This alternative may seem more intuitive (and closer to Python):

[storage_indx_mult[i, j] for (i, j) in zip(id_index, column_index_mult)]

Note that, in Julia, your storage_indx_mult[id_index, column_index_mult] is actually equivalent to [storage_indx_mult[i, j] for i in id_index, j in column_index_mult].
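As a quick check of that equivalence, the sketch below uses the data from the question. Indexing with two vectors selects every combination of row and column index, so the result is a 7×7 matrix. The pairwise values the question asks for sit on its diagonal. The names full, outer and diagonal are only illustrative.

```julia
storage_indx_mult = hcat([12, 165, 55, 66, 89, 101, 2], [68, 135, 409, 222, 6, 818, 7])
column_index_mult = [1, 1, 2, 2, 1, 1, 2]
id_index = [6, 4, 5, 2, 1, 7, 3]

# Indexing with two vectors takes every (row, column) combination.
full = storage_indx_mult[id_index, column_index_mult]
outer = [storage_indx_mult[i, j] for i in id_index, j in column_index_mult]
@assert full == outer
@assert size(full) == (7, 7)

# Pairing the k-th row index with the k-th column index is the diagonal of that matrix.
diagonal = [full[k, k] for k in eachindex(id_index)]
@assert diagonal == [101, 66, 6, 135, 12, 2, 409]
```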

Yet another option:

storage_indx_mult[CartesianIndex.(id_index, column_index_mult)]
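Spelled out in two steps with the data from the question, this might look like the sketch below. Broadcasting CartesianIndex over the two vectors builds one (row, column) index per pair, and indexing the matrix with that vector returns one element per index. The names idx and picked are only illustrative.

```julia
storage_indx_mult = hcat([12, 165, 55, 66, 89, 101, 2], [68, 135, 409, 222, 6, 818, 7])
column_index_mult = [1, 1, 2, 2, 1, 1, 2]
id_index = [6, 4, 5, 2, 1, 7, 3]

# One CartesianIndex per (row, column) pair.
idx = CartesianIndex.(id_index, column_index_mult)
@assert idx[1] == CartesianIndex(6, 1)

# Indexing with a vector of CartesianIndex values returns a vector of elements.
picked = storage_indx_mult[idx]
@assert picked == [101, 66, 6, 135, 12, 2, 409]
```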

jpmorr · August 12, 2021, 1:41pm

Thanks for these other options. The CartesianIndex approach seems the most straightforward; I find list comprehensions easy to mess up!

I did a quick check with BenchmarkTools to compare performance, and for a typical dataset I would be working on, with about 100K rows, getindex seems to be the quickest.

@btime pIDs[CartesianIndex.(ids[:,1], cc2[:, 1].+1)]
216.600 μs (14 allocations: 1.78 MiB)
@btime [pIDs[i, j] for (i, j) in zip(ids[:,1], cc2[:, 1].+1)]
1.886 ms (99894 allocations: 2.91 MiB)
@btime getindex.(Ref(pIDs), ids[:,1], cc2[:, 1].+1)
107.900 μs (13 allocations: 1010.94 KiB)
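One caveat on these numbers: the expressions above are timed directly on global variables, and the high allocation count for the comprehension may partly reflect that rather than the approach itself. A rough sketch of a fairer comparison wraps each variant in a function. The data here is synthetic and only assumed to resemble mine (the names rows, cols and the pick_* functions are made up for illustration, and the sizes are guesses).

```julia
# Synthetic stand-ins for the real data (assumed shapes, not the actual dataset).
n = 100_000
pIDs = rand(1:10^6, n, 2)
rows = rand(1:n, n)
cols = rand(1:2, n)

# Each variant as a function, so the work runs on arguments rather than globals.
pick_broadcast(A, r, c) = getindex.(Ref(A), r, c)
pick_cartesian(A, r, c) = A[CartesianIndex.(r, c)]
pick_comprehension(A, r, c) = [A[i, j] for (i, j) in zip(r, c)]

# All three should return the same vector.
reference = pick_broadcast(pIDs, rows, cols)
@assert reference == pick_cartesian(pIDs, rows, cols) == pick_comprehension(pIDs, rows, cols)

# With BenchmarkTools loaded, each call could then be timed with interpolated
# arguments, for example: @btime pick_broadcast($pIDs, $rows, $cols)
```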
