Hello, good afternoon.
Is it possible to use a Vector of CartesianIndex to select data from a matrix, as below? The issue is that it gives me an empty result: Matrix{Float64}(undef, 0, 1)
@show data[groupidx .== 3, :]
Your question needs more explanation. You don’t explain any of the variables, and the example appears to be using a BitArray, not a vector of CartesianIndex.
I am not clear about what the issue may be. Here’s an example that has a non-empty result.
julia> data = rand(0:9, 16,16)
16×16 Matrix{Int64}:
4 3 1 1 9 2 8 7 8 1 2 5 4 6 8 6
6 7 4 5 3 9 4 3 3 9 9 4 9 6 2 3
4 6 8 7 5 9 9 7 5 6 8 9 9 9 8 9
9 7 5 7 0 5 7 1 6 2 2 8 6 3 5 3
1 8 9 2 5 9 2 4 1 9 4 9 1 6 4 5
7 4 9 6 9 0 4 4 6 0 1 0 4 1 2 3
0 1 6 5 2 3 2 1 0 4 5 2 8 9 8 1
1 3 1 0 0 2 3 6 2 7 6 9 6 9 6 2
0 1 6 1 5 8 8 8 3 2 1 2 9 4 7 2
1 7 6 1 4 3 0 0 7 5 3 7 7 8 7 1
4 5 1 5 7 5 1 6 2 8 0 9 1 9 5 9
2 7 7 6 6 3 0 7 5 6 3 0 5 7 0 1
7 0 1 6 9 5 7 1 4 1 5 0 3 6 5 8
2 6 8 6 8 1 5 8 0 8 5 3 0 4 6 3
9 9 9 7 6 1 4 5 7 6 7 3 7 4 7 8
4 9 4 9 8 4 6 2 3 8 8 3 6 4 7 3
julia> groupidx = rand(1:3, 16)
16-element Vector{Int64}:
2
3
1
1
3
2
2
3
1
1
1
3
3
3
1
3
julia> data[groupidx .==3, :]
7×16 Matrix{Int64}:
6 7 4 5 3 9 4 3 3 9 9 4 9 6 2 3
1 8 9 2 5 9 2 4 1 9 4 9 1 6 4 5
1 3 1 0 0 2 3 6 2 7 6 9 6 9 6 2
2 7 7 6 6 3 0 7 5 6 3 0 5 7 0 1
7 0 1 6 9 5 7 1 4 1 5 0 3 6 5 8
2 6 8 6 8 1 5 8 0 8 5 3 0 4 6 3
4 9 4 9 8 4 6 2 3 8 8 3 6 4 7 3
If there are no 3s in groupidx, then one would get:
julia> groupidx = rand(1:2, 16)
16-element Vector{Int64}:
1
1
1
1
1
1
1
1
2
1
1
1
1
2
2
1
julia> data[groupidx .==3, :]
0×16 Matrix{Int64}
What else are you expecting?
It's something like that, but groupidx is of type CartesianIndex. Is it possible to do this with this data type?
@show data[groupidx .== 3, :]
You have to convert the CartesianIndex to a linear index before comparing with raw integers:
julia> # assume size(data) == (3, 2)
lin2cart(ind) = CartesianIndices((3, 2))[ind]
lin2cart (generic function with 1 method)
julia> cart2lin(ind) = LinearIndices((3,2))[ind]
cart2lin (generic function with 1 method)
julia> lin2cart(1)
CartesianIndex(1, 1)
julia> lin2cart(2)
CartesianIndex(2, 1)
julia> lin2cart(3)
CartesianIndex(3, 1)
julia> cart2lin(CartesianIndex(1,1))
1
julia> cart2lin(CartesianIndex(2,1))
2
julia> cart2lin(CartesianIndex(1,2))
4
Notice that a CartesianIndex is not iterable, but you can extract the tuple inside with CartesianIndex(1,1).I in case you need to pass (i,j) separately somewhere.
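The same conversions can be done without hard-coding the size, by building the index objects from the array itself. This is only a small sketch on a made-up 3×2 matrix `A` (not the data from the question); it also shows `Tuple(ci)` as an alternative to reaching into the `.I` field.

```julia
A = reshape(1:6, 3, 2)            # 3×2 matrix whose entries equal their linear indices

lin  = LinearIndices(A)           # lookup: CartesianIndex -> Int
cart = CartesianIndices(A)        # lookup: Int -> CartesianIndex

ci   = CartesianIndex(1, 2)
li   = lin[ci]                    # 4, because Julia arrays are column-major
back = cart[li]                   # CartesianIndex(1, 2) again

i, j = Tuple(ci)                  # row and column as plain integers

# A whole vector of CartesianIndex can be converted in one call
vci = [CartesianIndex(1, 1), CartesianIndex(3, 2)]
lis = lin[vci]                    # [1, 6]
```

Note that the linear index mixes row and column, so for a "which column won" question the column component `j` is usually the more useful number.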
The short answer: yes.
julia> data = rand(0:9, 16,16)
16×16 Matrix{Int64}:
6 3 9 2 4 9 0 8 6 5 7 8 9 9 1 3
1 4 7 9 4 6 8 2 4 8 6 5 1 3 7 2
7 1 9 3 1 6 7 3 8 3 5 5 6 9 0 3
2 5 6 5 6 0 0 4 3 4 8 7 5 1 1 8
1 5 4 3 9 2 6 1 0 7 9 5 2 4 0 8
⋮ ⋮ ⋮ ⋮
5 5 1 2 0 0 3 3 7 3 0 3 0 8 1 9
8 3 2 6 5 4 3 0 2 2 5 1 8 3 3 2
3 1 4 0 2 8 4 9 1 5 9 8 6 1 8 6
3 1 2 9 2 1 5 0 3 7 4 6 5 8 8 4
3 0 1 4 4 2 5 6 8 1 7 8 8 1 6 3
julia> vci=rand(CartesianIndices((16,16)),7)
7-element Vector{CartesianIndex{2}}:
CartesianIndex(10, 14)
CartesianIndex(7, 10)
CartesianIndex(5, 9)
CartesianIndex(3, 8)
CartesianIndex(6, 8)
CartesianIndex(11, 14)
CartesianIndex(1, 8)
julia> data[vci]
7-element Vector{Int64}:
7
1
0
3
4
5
8
If you want them sorted:
julia> sort(vci, by=x->x.I)
7-element Vector{CartesianIndex{2}}:
CartesianIndex(1, 8)
CartesianIndex(3, 8)
CartesianIndex(5, 9)
CartesianIndex(6, 8)
CartesianIndex(7, 10)
CartesianIndex(10, 14)
CartesianIndex(11, 14)
# or
julia> sort(vci, by=Tuple)
7-element Vector{CartesianIndex{2}}:
CartesianIndex(1, 8)
CartesianIndex(3, 8)
CartesianIndex(5, 9)
CartesianIndex(6, 8)
CartesianIndex(7, 10)
CartesianIndex(10, 14)
CartesianIndex(11, 14)
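For reference, here is a deterministic version of the same idea, so the results can be checked by eye. The matrix `A` and the threshold are made up for illustration. It also shows `findall`, which is a common source of a `Vector{CartesianIndex}` when it is applied to a matrix.

```julia
A = [10 20 30;
     40 50 60]

# Indexing with a vector of CartesianIndex returns one element per index
idx    = [CartesianIndex(1, 2), CartesianIndex(2, 3)]
picked = A[idx]                   # [20, 60]

# findall on a matrix returns CartesianIndex values in column-major order
hits = findall(>(35), A)          # (2,1), (2,2), (2,3)
vals = A[hits]                    # [40, 50, 60]
```

So a vector of CartesianIndex selects individual elements, not whole rows; selecting rows still needs a Boolean mask or integer row numbers.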
To be exact, I wanted to use the CartesianIndex to do Boolean selection on a matrix. It's an implementation of the k-means algorithm that I'm writing.
# get the min distances, which gives me CartesianIndex values
groupidx = argmin(distances, dims=2)
# select the min distances from clusters
for ki in 1:3
centroids[ki, :] = [mean(data[groupidx .== ki], 1), mean(data[groupidx .== ki], 2)]
end
It's hard to find basic machine learning content for Julia of the kind you get in books (in this case, linear algebra content), so I need to port my Python examples to Julia to learn. But numpy and Python differ from Julia: there, argmin just outputs a plain numpy array with the min indexes.
You can find some examples online to remove old habits from other languages:
For example, the “numpy” way often works with arrays all over the place. The Julia way avoids allocating arrays and uses explicit for loops more, as often described in pseudocode in textbooks.
You might want to broadcast argmin over the rows or columns, which just gives you a number for each one.
julia> using LinearAlgebra  # provides norm
julia> function get_distance_matrix(A,B)
map(Iterators.product(A,B)) do (a,b)
norm(a-b)
end
end
get_distance_matrix (generic function with 2 methods)
julia> distances = get_distance_matrix(eachrow(data), eachrow(rand(0:9, 3, 3)))
16×3 Matrix{Float64}:
1.41421 7.87401 5.74456
...
julia> argmin.(eachrow(distances))
16-element Vector{Int64}:
1
3
2
3
3
1
1
3
1
3
2
3
1
3
3
1
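To make the difference concrete, here is a small sketch with a hand-written `distances` matrix (the values are invented so that the result is predictable). It compares the CartesianIndex output of `argmin(...; dims=2)` with the plain integers from the broadcast form, and shows why comparing the former with an integer selects nothing.

```julia
distances = [1.0 5.0 3.0;
             4.0 2.0 6.0;
             7.0 8.0 0.5;
             0.2 9.0 9.0]

ci = argmin(distances, dims=2)           # 4×1 Matrix{CartesianIndex{2}}

# A CartesianIndex never equals a plain integer, so this mask is all false
mask = ci .== 1
nothing_selected = !any(mask)            # true

# Three ways to get the winning column as an Int for each row
labels_a = argmin.(eachrow(distances))   # [1, 2, 3, 1]
labels_b = getindex.(vec(ci), 2)         # second component of each index
labels_c = last.(Tuple.(vec(ci)))        # same thing via Tuple
```

All three label vectors agree, so which one to use is mostly a matter of taste; the broadcast over `eachrow` avoids the CartesianIndex detour entirely.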
I tried to answer what was asked in the title.
I can’t figure out what the problem is (and I don’t seem to be the only one).
Can you give a minimal but complete example of what you are looking for, with some basic examples of what distances, groupidx and data are?
Here is an example making your approach work. I should note that there are efficiency issues here.
julia> using Statistics, LinearAlgebra
julia> data = rand(0.:9., 5, 3)
5×3 Matrix{Float64}:
0.0 3.0 1.0
2.0 3.0 5.0
9.0 4.0 3.0
4.0 5.0 9.0
7.0 8.0 1.0
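julia> k = 3;  # number of clusters; not shown in the original session, but implied by the 3×3 output
julia> centroids = rand(0.:9., k, 3)
3×3 Matrix{Float64}: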
4.0 3.0 9.0
0.0 1.0 4.0
9.0 7.0 2.0
julia> function get_distance_matrix(A,B)
map(Iterators.product(A,B)) do (a,b)
norm(a-b)
end
end
get_distance_matrix (generic function with 1 method)
julia> distances = get_distance_matrix(eachrow(data), eachrow(centroids))
5×3 Matrix{Float64}:
8.94427 3.60555 9.89949
4.47214 3.0 8.60233
7.87401 9.53939 3.16228
2.0 7.54983 8.83176
9.89949 10.3441 2.44949
julia> groupidx = argmin.(eachrow(distances))
5-element Vector{Int64}:
2
2
3
1
3
julia> for ki in 1:k
centroids[ki, :] = mean(data[groupidx .== ki, :], dims=1)
end
julia> centroids
3×3 Matrix{Float64}:
4.0 5.0 9.0
1.0 3.0 3.0
8.0 6.0 2.0
julia> distances = get_distance_matrix(eachrow(data), eachrow(centroids))
5×3 Matrix{Float64}:
9.16515 2.23607 8.60233
4.89898 2.23607 7.34847
7.87401 8.06226 2.44949
0.0 7.0 8.12404
9.05539 8.06226 2.44949
julia> groupidx = argmin.(eachrow(distances))
5-element Vector{Int64}:
2
2
3
1
3
julia> for ki in 1:k
centroids[ki, :] = mean(data[groupidx .== ki, :], dims=1)
end
julia> centroids
3×3 Matrix{Float64}:
4.0 5.0 9.0
1.0 3.0 3.0
8.0 6.0 2.0
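On the efficiency point: each `data[groupidx .== ki, :]` builds a mask and a copy of the selected rows. One possible alternative is a single pass with explicit loops, sketched below. The function name `update_centroids!` is hypothetical, and leaving an empty cluster at its previous centroid is an assumption of this sketch rather than something from the thread. The data, starting centroids and labels are the ones from the session above.

```julia
# Hypothetical helper: recompute each centroid as the mean of the rows assigned to it.
function update_centroids!(centroids, data, labels)
    k, d = size(centroids)
    sums = zeros(k, d)        # running sum of the rows in each cluster
    counts = zeros(Int, k)    # number of rows in each cluster
    for i in axes(data, 1)
        ki = labels[i]
        counts[ki] += 1
        for j in 1:d
            sums[ki, j] += data[i, j]
        end
    end
    for ki in 1:k
        # Assumption: an empty cluster keeps its previous centroid
        counts[ki] == 0 && continue
        for j in 1:d
            centroids[ki, j] = sums[ki, j] / counts[ki]
        end
    end
    return centroids
end

data = [0.0 3.0 1.0;
        2.0 3.0 5.0;
        9.0 4.0 3.0;
        4.0 5.0 9.0;
        7.0 8.0 1.0]
centroids = [4.0 3.0 9.0;
             0.0 1.0 4.0;
             9.0 7.0 2.0]
labels = [2, 2, 3, 1, 3]

update_centroids!(centroids, data, labels)
```

With these inputs the updated centroids match the ones printed above, i.e. rows (4, 5, 9), (1, 3, 3) and (8, 6, 2).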
You may be looking for a broadcasted getindex to pull an index from the vector of CartesianIndices. Is this what you are looking for:
julia> M = rand(5,5)
5×5 Matrix{Float64}:
0.485589 0.982155 0.0501349 0.235755 0.691366
0.830701 0.66042 0.450443 0.690616 0.281628
0.979683 0.849866 0.724607 0.0028576 0.551634
0.337247 0.899001 0.10619 0.767832 0.199422
0.225012 0.925256 0.807346 0.00346096 0.316832
julia> groupidx = getindex.(argmin(M; dims=2),2)
5×1 Matrix{Int64}:
3
5
4
3
4
julia> M[groupidx .== 3, :]
2×1 Matrix{Float64}:
0.48558867131443617
0.33724703282163426
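One thing to watch in the output above: groupidx is a 5×1 matrix rather than a vector, and the selection comes back with a single column. If the intent is to pull out whole rows (my assumption, given the k-means use case), flattening with `vec` first gives a plain vector mask. A sketch with a made-up `M` so the result is predictable:

```julia
M = [0.5 0.9 0.1;
     0.8 0.2 0.7;
     0.3 0.6 0.05;
     0.4 0.1 0.9]

# vec turns the 4×1 matrix of CartesianIndex into a Vector before extracting the column
groupidx = getindex.(vec(argmin(M; dims=2)), 2)   # [3, 2, 3, 2]

# A Vector{Bool} mask on the first dimension selects complete rows
rows = M[groupidx .== 3, :]                        # rows 1 and 3, all 3 columns
```

Here `rows` has one row per matching entry of groupidx and keeps every column of `M`.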
Yep, all of the above solutions worked well. Thank you all so much for the help. Very nice approach to get the indices out of the CartesianIndex.