I am trying to learn k-means clustering using the Clustering package and the example code given on “ClusteringSolutions” with the iris data. After reading the CSV file, when I attempt to run k-means clustering using the example code, I get the following error.
iris=CSV.read("iris.csv");
features = Array(iris[:,[1,3,4]])'
result = kmeans( features, 3 )
┌ Warning: indexing with colon as row will create a copy in the future use df[col_inds] to get the columns without copying
│ caller = top-level scope at In[5]:1
└ @ Core In[5]:1
MethodError: no method matching Array(::DataFrames.DataFrame)
Closest candidates are:
Array(!Matched::LinearAlgebra.SymTridiagonal) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\tridiag.jl:111
Array(!Matched::LinearAlgebra.Tridiagonal) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\tridiag.jl:518
Array(!Matched::LinearAlgebra.AbstractTriangular) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\triangular.jl:106
...
Stacktrace:
[1] top-level scope at In[5]:1
I understand the warning, but I do not understand the error that follows it. If I change the code a bit, I still get an error, this time from kmeans itself.
features=convert(Array{Float64},iris[1:4]);
features=features'
result = kmeans( features, 3 )
MethodError: no method matching kmeans(::LinearAlgebra.Adjoint{Float64,Array{Float64,2}}, ::Int64)
Closest candidates are:
kmeans(!Matched::Array{T<:AbstractFloat,2}, ::Int64; weights, init, maxiter, tol, display, distance) where T<:AbstractFloat at C:\Users\chatura\.julia\packages\Clustering\jd204\src\kmeans.jl:51
Stacktrace:
[1] top-level scope at In[13]:4
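The second failure seems reproducible without Clustering or the iris data. Below is a minimal sketch using only Base and LinearAlgebra. Note that `needs_dense` is a hypothetical stand-in I wrote to copy the shape of the `kmeans` signature shown in the error message; it is not a function from any package, and the small matrix is made-up data.

```julia
using LinearAlgebra

# Hypothetical stand-in mirroring the signature from the error message:
# it accepts only a plain dense 2-d Array of floats.
needs_dense(X::Array{T,2}, k::Int) where {T<:AbstractFloat} = size(X)

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]    # made-up 3×2 dense matrix

At = A'                            # lazy wrapper around A, not a new Array
println(typeof(At))                # some Adjoint{Float64, ...} type
println(At isa Adjoint)            # true
println(At isa Array{Float64,2})   # false
println(At isa AbstractMatrix)     # true

# Passing the wrapper to a method that demands Array{T,2} raises a MethodError.
threw = try
    needs_dense(At, 3)
    false
catch err
    err isa MethodError
end
println(threw)                     # true
```

If this sketch reflects what happens inside my notebook, then `features'` is still a matrix in the abstract sense, but it is not the concrete `Array{T,2}` type that the listed `kmeans` method asks for.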
Finally, when I modify the code as per the instructions given in the Clustering docs (K-means — Clustering 0.3.0 documentation), everything seems to work fine.
# features=convert(Array{Float64},iris[1:4]);
features=permutedims(convert(Array{Float64}, iris[1:4]), [2, 1])
result = kmeans(features, 4; maxiter=50, display=:iter)
Iters objv objv-change | affected
-------------------------------------------------------------
0 1.228900e+02
1 5.679215e+01 -6.609785e+01 | 3
2 5.591592e+01 -8.762242e-01 | 3
3 5.584415e+01 -7.177153e-02 | 2
4 5.580442e+01 -3.973175e-02 | 0
5 5.580442e+01 0.000000e+00 | 0
K-means converged with 5 iterations (objv = 55.80442051737124)
KmeansResult{Float64}([7.02692 5.01633 5.53214 6.27391; 3.1 3.44082 2.63571 2.88696; 5.94615 1.46735 3.96071 4.89348; 2.15 0.242857 1.22857 1.68043], [2, 2, 2, 2, 2, 2, 2, 2, 2, 2 … 1, 1, 4, 1, 1, 4, 4, 4, 4, 4], [0.0168763, 0.214223, 0.187897, 0.292387, 0.0319783, 0.436876, 0.182795, 0.00483549, 0.678713, 0.151162 … 0.289201, 0.754586, 0.350406, 0.0861243, 0.32997, 0.672146, 0.209972, 0.259972, 0.909102, 0.209537], [26, 49, 28, 46], [26.0, 49.0, 28.0, 46.0], 55.80442051737124, 5, true)
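As far as I can tell, what makes this last version work is that `permutedims` returns a new dense array rather than a lazy wrapper. A small sketch with made-up numbers (the matrix `X` below is hypothetical, not the iris data) that compares it with materializing the adjoint via `Matrix`:

```julia
# Hypothetical data: 3 observations × 2 features, one observation per row.
X = [5.1 3.5; 4.9 3.0; 6.3 3.3]

P = permutedims(X, [2, 1])   # eager transpose, as in the working code
M = Matrix(X')               # alternative: copy the lazy adjoint into an Array

println(size(P))                  # (2, 3): one observation per column
println(P isa Array{Float64,2})   # true
println(M isa Array{Float64,2})   # true
println(P == M)                   # true for real-valued data
```

Both results are plain `Array{Float64,2}` values, so either should match the `kmeans` method listed in the earlier error message, though I have only run the `permutedims` form against Clustering myself.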
Although I am now able to run the example, I would like to understand why the first two attempts failed.