Is it possible to use a Vector of CartesianIndex to select data from a matrix?

Hello, good afternoon.

Is it possible to use a Vector of CartesianIndex to select data from a matrix, as below? The issue is that it gives me an undefined (empty) result: Matrix{Float64}(undef, 0, 1)

@show data[groupidx .== 3, :]

Your question needs more explanation. You don’t explain any of the variables, and the example appears to be using a BitArray, not a vector of CartesianIndex.
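
To illustrate the distinction, here is a small sketch with made-up values (the names `A`, `labels`, `mask` and `cis` are hypothetical, not taken from the question). Comparing with `.==` yields a Boolean mask that selects whole rows, whereas a vector of CartesianIndex names individual elements.

```julia
A = [10 20; 30 40; 50 60]          # 3×2 example matrix (illustrative values)
labels = [3, 1, 3]                 # one integer label per row

mask = labels .== 3                # BitVector: true where the label is 3
rows = A[mask, :]                  # logical indexing keeps rows 1 and 3

cis = [CartesianIndex(1, 2), CartesianIndex(3, 1)]
elems = A[cis]                     # picks the elements A[1, 2] and A[3, 1]
```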

I am not clear about what the issue may be. Here’s an example that has a non-empty result.

julia> data = rand(0:9, 16,16)
16×16 Matrix{Int64}:
 4  3  1  1  9  2  8  7  8  1  2  5  4  6  8  6
 6  7  4  5  3  9  4  3  3  9  9  4  9  6  2  3
 4  6  8  7  5  9  9  7  5  6  8  9  9  9  8  9
 9  7  5  7  0  5  7  1  6  2  2  8  6  3  5  3
 1  8  9  2  5  9  2  4  1  9  4  9  1  6  4  5
 7  4  9  6  9  0  4  4  6  0  1  0  4  1  2  3
 0  1  6  5  2  3  2  1  0  4  5  2  8  9  8  1
 1  3  1  0  0  2  3  6  2  7  6  9  6  9  6  2
 0  1  6  1  5  8  8  8  3  2  1  2  9  4  7  2
 1  7  6  1  4  3  0  0  7  5  3  7  7  8  7  1
 4  5  1  5  7  5  1  6  2  8  0  9  1  9  5  9
 2  7  7  6  6  3  0  7  5  6  3  0  5  7  0  1
 7  0  1  6  9  5  7  1  4  1  5  0  3  6  5  8
 2  6  8  6  8  1  5  8  0  8  5  3  0  4  6  3
 9  9  9  7  6  1  4  5  7  6  7  3  7  4  7  8
 4  9  4  9  8  4  6  2  3  8  8  3  6  4  7  3

julia> groupidx = rand(1:3, 16)
16-element Vector{Int64}:
 2
 3
 1
 1
 3
 2
 2
 3
 1
 1
 1
 3
 3
 3
 1
 3

julia> data[groupidx .==3, :]
7×16 Matrix{Int64}:
 6  7  4  5  3  9  4  3  3  9  9  4  9  6  2  3
 1  8  9  2  5  9  2  4  1  9  4  9  1  6  4  5
 1  3  1  0  0  2  3  6  2  7  6  9  6  9  6  2
 2  7  7  6  6  3  0  7  5  6  3  0  5  7  0  1
 7  0  1  6  9  5  7  1  4  1  5  0  3  6  5  8
 2  6  8  6  8  1  5  8  0  8  5  3  0  4  6  3
 4  9  4  9  8  4  6  2  3  8  8  3  6  4  7  3

If there are no 3s in groupidx, then one would get an empty result:

julia> groupidx = rand(1:2, 16)
16-element Vector{Int64}:
 1
 1
 1
 1
 1
 1
 1
 1
 2
 1
 1
 1
 1
 2
 2
 1

julia> data[groupidx .==3, :]
0×16 Matrix{Int64}
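
A quick way to see why a selection comes back with zero rows is to inspect the mask itself. The following is a minimal sketch with illustrative values, not the data from the transcript above.

```julia
data = [1 5; 2 6; 3 7; 4 8]        # 4×2 example data (illustrative values)
groupidx = [1, 1, 2, 1]            # no row carries the label 3

mask = groupidx .== 3
nselected = count(mask)            # how many rows the mask would keep
anyselected = any(mask)            # whether it keeps any row at all
emptysize = size(data[mask, :])    # zero rows, but the columns are preserved
```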

What else are you expecting?

It's something like that, but groupidx is of type CartesianIndex.

groupidx is a CartesianIndex. Is it possible to do this with this data type?

@show data[groupidx .== 3, :]

You have to convert the CartesianIndex to a linear index before comparing with raw integers:

julia> # assume size(data) == (3, 2)
       lin2cart(ind) = CartesianIndices((3, 2))[ind]
lin2cart (generic function with 1 method)

julia> cart2lin(ind) = LinearIndices((3,2))[ind]
cart2lin (generic function with 1 method)

julia> lin2cart(1)
CartesianIndex(1, 1)

julia> lin2cart(2)
CartesianIndex(2, 1)

julia> lin2cart(3)
CartesianIndex(3, 1)

julia> cart2lin(CartesianIndex(1,1))
1

julia> cart2lin(CartesianIndex(2,1))
2

julia> cart2lin(CartesianIndex(1,2))
4

Notice that a CartesianIndex is not iterable, but you can extract the tuple inside with CartesianIndex(1,1).I in case you need to pass (i,j) separately somewhere.
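
As a small sketch of the extraction mentioned above (the variable names are only for illustration), the components of a CartesianIndex can be reached through the `I` field, through `Tuple`, or by indexing into it:

```julia
ci = CartesianIndex(2, 3)

t1 = ci.I            # the underlying tuple
t2 = Tuple(ci)       # the same tuple via conversion
col = ci[2]          # a single component

i, j = Tuple(ci)     # destructure into separate row and column values
```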

The short answer: yes.

julia> data = rand(0:9, 16,16)
16×16 Matrix{Int64}:
 6  3  9  2  4  9  0  8  6  5  7  8  9  9  1  3
 1  4  7  9  4  6  8  2  4  8  6  5  1  3  7  2
 7  1  9  3  1  6  7  3  8  3  5  5  6  9  0  3
 2  5  6  5  6  0  0  4  3  4  8  7  5  1  1  8
 1  5  4  3  9  2  6  1  0  7  9  5  2  4  0  8
 ⋮              ⋮              ⋮              ⋮
 5  5  1  2  0  0  3  3  7  3  0  3  0  8  1  9
 8  3  2  6  5  4  3  0  2  2  5  1  8  3  3  2
 3  1  4  0  2  8  4  9  1  5  9  8  6  1  8  6
 3  1  2  9  2  1  5  0  3  7  4  6  5  8  8  4
 3  0  1  4  4  2  5  6  8  1  7  8  8  1  6  3

julia> vci=rand(CartesianIndices((16,16)),7)
7-element Vector{CartesianIndex{2}}:
 CartesianIndex(10, 14)
 CartesianIndex(7, 10)
 CartesianIndex(5, 9)
 CartesianIndex(3, 8)
 CartesianIndex(6, 8)
 CartesianIndex(11, 14)
 CartesianIndex(1, 8)

julia> data[vci]
7-element Vector{Int64}:
 7
 1
 0
 3
 4
 5
 8
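
For comparison, here is a hedged sketch on a tiny matrix with illustrative values. It shows that indexing with a vector of CartesianIndex is equivalent to looking up each element in turn, and how the row and column numbers could be pulled out separately if needed.

```julia
data = [11 12 13; 21 22 23; 31 32 33]
vci = [CartesianIndex(3, 1), CartesianIndex(1, 2)]

vals = data[vci]                            # one value per CartesianIndex
same = [data[ci[1], ci[2]] for ci in vci]   # element-by-element equivalent

rows = first.(Tuple.(vci))                  # row component of each index
cols = last.(Tuple.(vci))                   # column component of each index
```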

If you want them sorted:

julia> sort(vci, by=x->x.I)
7-element Vector{CartesianIndex{2}}:
 CartesianIndex(1, 8)
 CartesianIndex(3, 8)
 CartesianIndex(5, 9)
 CartesianIndex(6, 8)
 CartesianIndex(7, 10)
 CartesianIndex(10, 14)
 CartesianIndex(11, 14)

# or

julia> sort(vci, by=Tuple)
7-element Vector{CartesianIndex{2}}:
 CartesianIndex(1, 8)
 CartesianIndex(3, 8)
 CartesianIndex(5, 9)
 CartesianIndex(6, 8)
 CartesianIndex(7, 10)
 CartesianIndex(10, 14)
 CartesianIndex(11, 14)

To be exact, I wanted to use the CartesianIndex to do a boolean selection on a matrix. It's part of an implementation of the k-means algorithm I'm writing.

# get the min distances, which gives me CartesianIndex values
groupidx = argmin(distances, dims=2)

# select the min distances from clusters
for ki in 1:3
    centroids[ki, :] = [mean(data[groupidx .== ki], 1), mean(data[groupidx .== ki], 2)]
end

It's hard to find basic machine learning content for Julia like there is in books, so I need to port my Python examples to Julia in order to learn, but NumPy and Python differ from Julia. In NumPy, argmin just outputs a plain array of the minimum indexes.

In this case, linear algebra content.
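
The empty selection described above can be reproduced with a small sketch (the `distances` values here are made up for illustration). With `dims=2`, `argmin` returns a column of CartesianIndex values, and comparing a CartesianIndex with a plain integer is never true, so the mask selects nothing.

```julia
distances = [0.5 2.0 3.0;
             4.0 1.0 6.0;
             7.0 8.0 0.2]              # illustrative point-to-centroid distances

groupidx = argmin(distances, dims=2)   # 3×1 Matrix{CartesianIndex{2}}
mask = groupidx .== 3                  # CartesianIndex compared with an Int
nothing_selected = !any(mask)          # every comparison is false
```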

You can find some examples online to remove old habits from other languages:

For example, the “numpy” way often works with arrays all over the place. The Julia way avoids allocating intermediate arrays and uses explicit for loops more often, much like the pseudocode found in textbooks.
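
As a rough sketch of that loop-based style, the assignment step of k-means could be written as below. The function name `assign_clusters` and the sample values are assumptions made for illustration; this is one possible way to write it, not a reference implementation.

```julia
# Assign each row of `data` to its nearest row of `centroids` using plain loops.
# No distance matrix is allocated; only the label vector is.
function assign_clusters(data::AbstractMatrix, centroids::AbstractMatrix)
    n, d = size(data)
    k = size(centroids, 1)
    labels = Vector{Int}(undef, n)
    for i in 1:n
        best, bestdist = 0, Inf
        for c in 1:k
            dist = 0.0
            for j in 1:d
                dist += (data[i, j] - centroids[c, j])^2   # squared Euclidean distance
            end
            if dist < bestdist
                best, bestdist = c, dist
            end
        end
        labels[i] = best
    end
    return labels
end

# Illustrative values: five 3-dimensional points and three centroids.
data = [0.0 3.0 1.0; 2.0 3.0 5.0; 9.0 4.0 3.0; 4.0 5.0 9.0; 7.0 8.0 1.0]
centroids = [4.0 3.0 9.0; 0.0 1.0 4.0; 9.0 7.0 2.0]
labels = assign_clusters(data, centroids)
```

The squared distance is used because it has the same minimizer as the Euclidean norm and avoids the square root.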

You might want to broadcast argmin over the rows or columns, which just gives you a number for each one.

julia> using LinearAlgebra  # provides norm

julia> function get_distance_matrix(A,B)
           map(Iterators.product(A,B)) do (a,b)
               norm(a-b)
           end
       end
get_distance_matrix (generic function with 2 methods)

julia> distances = get_distance_matrix(eachrow(data), eachrow(rand(0:9, 3, 3)))
16×3 Matrix{Float64}:
  1.41421   7.87401  5.74456
  ...

julia> argmin.(eachrow(distances))
16-element Vector{Int64}:
 1
 3
 2
 3
 3
 1
 1
 3
 1
 3
 2
 3
 1
 3
 3
 1
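
On a small matrix with illustrative values, the same idea looks like this. Applying `argmin` to each row returns plain integer column positions rather than CartesianIndex values.

```julia
distances = [1.0 2.0 3.0;
             4.0 0.5 6.0;
             7.0 8.0 0.1]                    # illustrative distances

labels = argmin.(eachrow(distances))        # column of the minimum in each row
labels2 = map(argmin, eachrow(distances))   # equivalent spelling with map
```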

I tried to answer what was asked in the title.
I can’t (and I don’t seem to be the only one) figure out what the problem is.
Can you give a minimal but complete example of what you are looking for?
Could you give some basic examples of what distances, groupidx and data are?

Here is an example making your approach work. I should note that there are efficiency issues here.

julia> using Statistics, LinearAlgebra

julia> data = rand(0.:9., 5, 3)
5×3 Matrix{Float64}:
 0.0  3.0  1.0
 2.0  3.0  5.0
 9.0  4.0  3.0
 4.0  5.0  9.0
 7.0  8.0  1.0

julia> k = 3; centroids = rand(0.:9., k, 3)
3×3 Matrix{Float64}:
 4.0  3.0  9.0
 0.0  1.0  4.0
 9.0  7.0  2.0

julia> function get_distance_matrix(A,B)
                  map(Iterators.product(A,B)) do (a,b)
                      norm(a-b)
                  end
              end
get_distance_matrix (generic function with 1 method)

julia> distances = get_distance_matrix(eachrow(data), eachrow(centroids))
5×3 Matrix{Float64}:
 8.94427   3.60555  9.89949
 4.47214   3.0      8.60233
 7.87401   9.53939  3.16228
 2.0       7.54983  8.83176
 9.89949  10.3441   2.44949

julia> groupidx = argmin.(eachrow(distances))
5-element Vector{Int64}:
 2
 2
 3
 1
 3

julia> for ki in 1:k
           centroids[ki, :] = mean(data[groupidx .== ki, :], dims=1)
       end

julia> centroids
3×3 Matrix{Float64}:
 4.0  5.0  9.0
 1.0  3.0  3.0
 8.0  6.0  2.0

julia> distances = get_distance_matrix(eachrow(data), eachrow(centroids))
5×3 Matrix{Float64}:
 9.16515  2.23607  8.60233
 4.89898  2.23607  7.34847
 7.87401  8.06226  2.44949
 0.0      7.0      8.12404
 9.05539  8.06226  2.44949

julia> groupidx = argmin.(eachrow(distances))
5-element Vector{Int64}:
 2
 2
 3
 1
 3

julia> for ki in 1:k
           centroids[ki, :] = mean(data[groupidx .== ki, :], dims=1)
       end

julia> centroids
3×3 Matrix{Float64}:
 4.0  5.0  9.0
 1.0  3.0  3.0
 8.0  6.0  2.0
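
One detail the transcript above does not cover is a cluster that ends up with no members: the mean of an empty selection is NaN. A hedged sketch of the update step wrapped in a function follows. The name `update_centroids!` and the choice to leave an empty cluster's centroid unchanged are assumptions of this sketch, not something the thread prescribes.

```julia
using Statistics

# Recompute each centroid as the mean of the rows assigned to it.
# A cluster with no members keeps its previous centroid (an assumption of this sketch).
function update_centroids!(centroids, data, groupidx)
    for ki in axes(centroids, 1)
        mask = groupidx .== ki
        any(mask) || continue                            # skip empty clusters
        centroids[ki, :] = vec(mean(data[mask, :], dims=1))
    end
    return centroids
end

# Same values as in the transcript above.
data = [0.0 3.0 1.0; 2.0 3.0 5.0; 9.0 4.0 3.0; 4.0 5.0 9.0; 7.0 8.0 1.0]
centroids = [4.0 3.0 9.0; 0.0 1.0 4.0; 9.0 7.0 2.0]
update_centroids!(centroids, data, [2, 2, 3, 1, 3])

# With every point in cluster 1, clusters 2 and 3 are empty and stay as they were.
untouched = update_centroids!([4.0 3.0 9.0; 0.0 1.0 4.0; 9.0 7.0 2.0], data, [1, 1, 1, 1, 1])
```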

You may be looking for a broadcasted getindex to pull one component out of each CartesianIndex in the array. Is this what you are looking for?

julia> M = rand(5,5)
5×5 Matrix{Float64}:
 0.485589  0.982155  0.0501349  0.235755    0.691366
 0.830701  0.66042   0.450443   0.690616    0.281628
 0.979683  0.849866  0.724607   0.0028576   0.551634
 0.337247  0.899001  0.10619    0.767832    0.199422
 0.225012  0.925256  0.807346   0.00346096  0.316832

julia> groupidx = getindex.(argmin(M; dims=2),2)
5×1 Matrix{Int64}:
 3
 5
 4
 3
 4

julia> M[groupidx .== 3, :]
2×1 Matrix{Float64}:
 0.48558867131443617
 0.33724703282163426
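
Note that in the output above the mask is a 5×1 matrix, and the selection returns single elements rather than whole rows. If whole rows are wanted, one option is to flatten the labels into a vector first. The following is a sketch with illustrative values, not the matrix from the transcript.

```julia
M = [0.9 0.1 0.5;
     0.2 0.8 0.7;
     0.6 0.3 0.4]                   # illustrative values

idx = argmin(M; dims=2)             # 3×1 Matrix{CartesianIndex{2}}
groupidx = vec(getindex.(idx, 2))   # column of each row's minimum, as a Vector{Int}

rows = M[groupidx .== 2, :]         # a BitVector mask keeps entire rows
```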

Yep, all of the above solutions worked well. Thank you all so much for the help. It's a very nice approach for getting the indices out of the CartesianIndex.

Here is the result 🙂
