DimensionMismatch: The size of a distance matrix ((1, 440)) doesn't match the length of assignment vector (440)

How can I fix this error?

using Clustering
using VectorizedStatistics

function MyKmeans(Train::Vector{Float64}, k::Int64)
    the_mat = Train'
    model = kmeans(the_mat, k)

    return vmean(silhouettes(assignments(model), counts(model), the_mat))
end
julia> MyKmeans(rand(440), 2)
ERROR: DimensionMismatch: The size of a distance matrix ((1, 440)) doesn't match the length of assignment vector (440).
Stacktrace:
 [1] silhouettes(assignments::Vector{Int64}, counts::Vector{Int64}, dists::Adjoint{Float64, Vector{Float64}})    
   @ Clustering C:\Users\Julia\.julia\packages\Clustering\eBCMN\src\silhouette.jl:55
 [2] MyKmeans(Train::Vector{Float64}, k::Int64, kind::Silhouette)
   @ Main e:\newapproach.jl:155
 [3] top-level scope
   @ REPL[33]:1

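The matrix passed to silhouettes needs to be a pairwise distance matrix (n×n for n points, so 440×440 here), not the 1×440 data matrix. So, the_mat can be initialized as: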

    the_mat = abs.(Train .- Train')

and then the function returns without an error.
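
Putting it together, here is a minimal sketch of the whole function, assuming the data matrix and the distance matrix are kept in separate variables. The names mean_silhouette, data and dists are my own, and I use mean from the Statistics standard library in place of vmean to keep the dependencies small:

```julia
using Clustering
using Statistics   # stdlib mean, swapped in for vmean from VectorizedStatistics

# Hypothetical helper: cluster a 1-D sample and return its mean silhouette value.
function mean_silhouette(train::Vector{Float64}, k::Int)
    data = train'                    # 1×n data matrix (one feature, n points) for kmeans
    model = kmeans(data, k)

    dists = abs.(train .- train')    # n×n pairwise distance matrix for silhouettes
    return mean(silhouettes(assignments(model), counts(model), dists))
end

score = mean_silhouette(rand(440), 2)
```

The result varies from run to run because both the data and the k-means initialization are random, but a silhouette value always lies between -1 and 1.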

Another option for getting the same distance matrix, in a more general style that makes it easy to swap in a different metric later (this needs using Distances):

    the_mat = pairwise(Euclidean(), Train)

Thanks.
Should I pass the result of abs.(Train .- Train') to the kmeans function instead of Train', or only to silhouettes?

Only to the silhouettes function; kmeans still takes Train'.
BTW, if you are new, you might have missed the REPL's built-in help: typing ? followed by a function name shows its documentation, which is usually informative for well-maintained packages.