A question about Distances.jl

This is my code:

using Distances

dm=pairwise(dist, X1,  dims=2)

LoadError: UndefVarError: pairwise not defined
Stacktrace:
[1] _kmeans!(X::Matrix{Float64}, weights::Nothing, centers::Matrix{Float64}, maxiter::Int64, tol::Float64, displevel::Int64, distance::Euclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:138
[2] kmeans!(X::Matrix{Float64}, centers::Matrix{Float64}; weights::Nothing, maxiter::Int64, tol::Float64, display::Symbol, distance::Euclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:70
[3] kmeans(X::Matrix{Float64}, k::Int64; weights::Nothing, init::Symbol, maxiter::Int64, tol::Float64, display::Symbol, distance::Euclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:112
[4] top-level scope
@ C:\Users\dell\Desktop\工程2\pca.jl:33
[5] eval
@ .\boot.jl:360 [inlined]
[6] include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1116
in expression starting at C:\Users\dell\Desktop\工程2\pca.jl:30
WARNING: both Distances and StatsBase export "pairwise"; uses of it in module Main must be qualified

Why is that? Thank you!

The answer lies in this warning: both Distances and StatsBase export pairwise, so the unqualified name is ambiguous in Main.
Changing your line to this should fix it:

dm=Distances.pairwise(dist, X1, dims=2)
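To make the suggestion concrete, here is a minimal sketch with both packages loaded. The matrix size and variable values are illustrative assumptions; only `Distances.pairwise`, `Euclidean`, and `dims=2` come from this thread.

```julia
using Distances, StatsBase   # both packages may export `pairwise`

X1 = rand(3, 5)              # hypothetical data: 5 observations stored as columns
dist = Euclidean()

# Qualifying the call with the module name sidesteps any clash between exports.
dm = Distances.pairwise(dist, X1, dims=2)

size(dm)                     # one row and one column per observation
```

Note that this only helps for calls written in your own script; a call inside another package is resolved in that package's module, not in Main.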

It does not work.

Could you please read through the recommendations here, and provide some more information based on the guidelines detailed there?
That would make it easier to help you.

My wild guess is that in line 30 of your file there’s just another call to pairwise that needs to be qualified (by changing pairwise to Distances.pairwise).

julia> using Distances

julia> dist = Distances.Euclidean()
Euclidean(0.0)

julia> X1 = rand(10, 10)
10×10 Matrix{Float64}:
 0.985882   0.036319    0.393556  …  0.0793244  0.310617     0.953439
 0.255112   0.435055    0.383491     0.112887   0.405229     0.722825
 0.563211   0.695017    0.634384     0.124678   0.507086     0.0131361
 0.0359783  0.00275204  0.506739     0.568531   0.145242     0.127146
 0.67697    0.989422    0.5165       0.624573   0.198026     0.32543
 0.182853   0.88098     0.413763  …  0.754935   0.640614     0.174002
 0.201681   0.447935    0.172959     0.745043   0.942729     0.127134
 0.160132   0.661322    0.414715     0.872607   0.000583358  0.430706
 0.637737   0.906884    0.498104     0.931592   0.0173489    0.857072
 0.781744   0.224723    0.270592     0.0443342  0.84014      0.192597

julia> dm = pairwise(dist, X1, dims=2)
10×10 Matrix{Float64}:
 0.0      1.49401   1.0097    1.57416  …  1.75528   1.37675  1.06223
 1.49401  0.0       1.06561   1.17709     1.03252   1.63317  1.58261
 1.0097   1.06561   0.0       1.11434     1.15746   1.27542  1.09085
 1.57416  1.17709   1.11434   0.0         1.42188   1.07083  1.89161
 1.22228  1.25201   0.986071  1.24955     1.35888   1.34039  1.40431
 1.48464  0.936854  0.84539   1.25875  …  0.791579  1.72414  1.39207
 1.72296  1.21385   1.21544   1.48213     0.733937  1.61077  1.76282
 1.75528  1.03252   1.15746   1.42188     0.0       1.71115  1.54109
 1.37675  1.63317   1.27542   1.07083     1.71115   0.0      1.7225
 1.06223  1.58261   1.09085   1.89161     1.54109   1.7225   0.0

That’s from a new session where I just installed Distances now and tried it. What are you doing differently from that example? Please go step by step through your own code to find the discrepancy.

It’s a call to pairwise from kmeans.jl (package Clustering) in line 138:

# core k-means skeleton
function _kmeans!(X::AbstractMatrix{<:Real},                # in: data matrix (d x n)
                  weights::Union{Nothing, Vector{<:Real}},  # in: data point weights (n)
                  centers::AbstractMatrix{<:AbstractFloat}, # in/out: matrix of centers (d x k)
                  maxiter::Int,                             # in: maximum number of iterations
                  tol::Float64,                             # in: tolerance of change at convergence
                  displevel::Int,                           # in: the level of display
                  distance::SemiMetric)                     # in: function to calculate distance
...
    # compute pairwise distances, preassign costs and cluster weights
    dmat = pairwise(distance, centers, X, dims=2)
    WC = (weights === nothing) ? Int : eltype(weights)
...

But I can’t see why it’s not defined.

using Clustering,DataFrames,MultivariateStats,Plots,Distances,StatsBase,StatsAPI
#function Estimated(Block)
#store=Vector{Matrix{}}(undef,12)
qwe=ones(1,10)

        model=fit(PCA,data1';# from MultivariateStats; analyzes each column
        #maxoutdim=10,inverse=true,# what is the output dimension?
        )
        X1=MultivariateStats.transform(model,data1')
    #    store[i]=X1
        #=
        model2 = UMAP_(X1, 3;
        # main tuning parameters
        n_neighbors=100,
        min_dist=0.01,
        set_operation_ratio=0.5,

    local_connectivity=1,
    repulsion_strength=1,
    #neg_sample_rate=5,
         metric=SqEuclidean(),
         n_epochs=300,
         learning_rate=1,
         init=:random,
         spread=1,
         neg_sample_rate=5)
         X2=UMAP.transform(model2, X1)
         store[i]=X2
         =#
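        for j=2:10# find the silhouette coefficient; with a single cluster it is always 1, so start from 2
            initseeds(:rand,convert(Matrix,X1),j)# initialize random seeds
            dist=Distances.Euclidean()# Euclidean distance
            global Xresult=kmeans(X1,j,distance=dist)# the result contains many fields; where did I see distance?
            dm=Distances.pairwise(dist, X1,  dims=2)# dims=2: distances between columns; dims=1: distances between rows
            est=silhouettes(Xresult,dm)# compute the silhouette coefficients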
    global qwe[j]=mean(est)
        end

This is my code. I have not found any problems in it. :sleepy:

This code worked before, but while debugging today I found that it no longer runs.

It shows a problem with pairwise in Kmeans. Is this a problem with Pkg?

Running the MWE I get

ERROR: LoadError: UndefVarError: data1 not defined

for these lines

        model=fit(PCA,data1';# from MultivariateStats; analyzes each column
        #maxoutdim=10,inverse=true,# what is the output dimension?
        )

Please check if your MWE runs in a fresh session.
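For reference, a self-contained version of the loop might look like the sketch below. It assumes random data in place of data1 and the PCA step, and a shorter range of cluster counts; the names scores and best_k are made up for illustration.

```julia
using Clustering, Distances, Statistics

X1 = rand(3, 60)                 # placeholder for the PCA output; columns are observations
dist = Euclidean()
scores = zeros(5)                # mean silhouette per k; entry 1 stays unused

for k in 2:5
    result = kmeans(X1, k; distance=dist)
    dm = Distances.pairwise(dist, X1, dims=2)   # n×n distances between columns
    scores[k] = mean(silhouettes(result, dm))
end

best_k = argmax(scores[2:end]) + 1   # k with the highest mean silhouette
```

If this fails in a fresh session with the same UndefVarError, the problem is not in the script itself.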

using Clustering 
#dist = Distances.Euclidean()
X1 = rand(10, 10)

#dm = Distances.pairwise(dist, X1, dims=2)
 ppresult=kmeans(X1,2)

The error:

ERROR: LoadError: UndefVarError: pairwise not defined
Stacktrace:
[1] _kmeans!(X::Matrix{Float64}, weights::Nothing, centers::Matrix{Float64}, maxiter::Int64, tol::Float64, displevel::Int64, distance::SqEuclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:138
[2] kmeans!(X::Matrix{Float64}, centers::Matrix{Float64}; weights::Nothing, maxiter::Int64, tol::Float64, display::Symbol, distance::SqEuclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:70
[3] kmeans(X::Matrix{Float64}, k::Int64; weights::Nothing, init::Symbol, maxiter::Int64, tol::Float64, display::Symbol, distance::SqEuclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:112
[4] kmeans(X::Matrix{Float64}, k::Int64)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:103
[5] top-level scope
@ untitled-7219c8427da4c703ddc83ce3a14e097e:6
[6] eval
@ .\boot.jl:360 [inlined]
[7] include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1116
in expression starting at untitled-7219c8427da4c703ddc83ce3a14e097e:6

This works for me yielding

KmeansResult{Matrix{Float64}, Float64, Int64}([0.5188068046007218 0.260315757382139; 0.4245585705784941 0.42545202951360767; … ; 0.48334083662314453 0.5610225098554704; 0.5548758185085466 0.3182716097781758], [1, 1, 1, 2, 2, 2, 2, 1, 2, 1], [0.5699331362713904, 0.426105568802682, 0.7011801938130722, 0.8480519158764617, 0.47559415578909814, 0.6768516041631454, 0.5981374516277773, 0.722932541191037, 0.4262934436027326, 0.45995556295140005], [5, 5], [5, 5], 5.905035574088797, 2, true)

Maybe a versioning problem? Using versioninfo() I see

Julia Version 1.8.0-DEV.1309
Commit 89f23325aa (2022-01-13 19:48 UTC)        
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.0 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 5

Pkg.status("Clustering") shows

  [aaaa29a8] Clustering v0.14.2

julia> versioninfo()
Julia Version 1.6.2
Commit 1b93d53fc4 (2021-07-14 15:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 6

julia> Pkg.status("Clustering")
Status C:\Users\dell\.julia\environments\v1.6\Project.toml
  [aaaa29a8] Clustering v0.14.2

Should I upgrade my version of Julia?

Pretty sure this has nothing to do with the Julia version - I just tried it on Julia 1.6.5 (but yes, it’s still good to update Julia).

Are you sure you are just running these three lines in a fresh Julia session?

julia> using Clustering

julia> X1 = rand(10, 10);

julia> ppresult = kmeans(X1, 2)

Maybe you could check with the L(ong)T(erm)S(upport) version of Julia (1.6.5), if you do not see a result similar to mine. I just double checked that the code snippet works for me on 1.6.5.

To add to this: please share the results you see.

I just tried on 1.5.3 as well, which works just as well. It seems very unlikely to me that the Julia version would make a difference as long as the package version is the same - more likely that OP still has the same export clash he had at the very start of this thread.
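The same Clustering version can still be installed alongside different versions of its dependencies, so comparing those may help narrow this down. A small sketch, assuming the packages live in the currently active environment; the list of names is my own choice, picked from the packages mentioned in this thread.

```julia
using Pkg

# Packages that appear in the `pairwise` warnings and in the `using` line above.
wanted = ("Clustering", "Distances", "StatsBase", "StatsAPI")

# Pkg.dependencies() covers direct and indirect dependencies of the active environment.
found = Dict(info.name => info.version
             for info in values(Pkg.dependencies())
             if info.name in wanted)

for name in wanted
    println(rpad(name, 12), get(found, name, "not installed"))
end
```

Running this on both machines and comparing the output would show whether the environments really match beyond Clustering itself.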

julia> using Clustering

julia> X1 = rand(10, 10)
10×10 Matrix{Float64}:
0.740344 0.337929 0.486613 0.33858 0.541362 0.127248 0.778971 0.233203 0.18098 0.688234
0.741174 0.712662 0.0480868 0.603911 0.654863 0.824774 0.323518 0.324083 0.786233 0.614437
0.651201 0.0124087 0.940346 0.421215 0.490089 0.678486 0.529856 0.292362 0.699654 0.0506924
0.318616 0.488626 0.944019 0.713478 0.267164 0.402172 0.483804 0.873122 0.827992 0.51143
0.464809 0.812081 0.170238 0.102422 0.561995 0.354934 0.852781 0.810898 0.667493 0.444468
0.198221 0.871926 0.91747 0.526024 0.971221 0.911117 0.724954 0.112341 0.345794 0.47511
0.474746 0.719516 0.937454 0.693812 0.609287 0.620827 0.0851626 0.878569 0.0325638 0.155254
0.156511 0.840661 0.42345 0.633863 0.860889 0.891205 0.48369 0.934373 0.108588 0.973565
0.980691 0.396536 0.533162 0.386414 0.855655 0.232734 0.218987 0.506902 0.885078 0.859081
0.959835 0.25124 0.261198 0.371961 0.601035 0.822135 0.274006 0.991046 0.305735 0.315071

julia> ppresult = kmeans(X1, 2)
WARNING: both StatsBase and Distances export "pairwise"; uses of it in module Clustering must be qualified
WARNING: both StatsBase and Distances export "pairwise!"; uses of it in module Clustering must be qualified
ERROR: UndefVarError: pairwise not defined
Stacktrace:
[1] _kmeans!(X::Matrix{Float64}, weights::Nothing, centers::Matrix{Float64}, maxiter::Int64, tol::Float64, displevel::Int64, distance::Distances.SqEuclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:138
[2] kmeans!(X::Matrix{Float64}, centers::Matrix{Float64}; weights::Nothing, maxiter::Int64, tol::Float64, display::Symbol, distance::Distances.SqEuclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:70
[3] kmeans(X::Matrix{Float64}, k::Int64; weights::Nothing, init::Symbol, maxiter::Int64, tol::Float64, display::Symbol, distance::Distances.SqEuclidean)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:112
[4] kmeans(X::Matrix{Float64}, k::Int64)
@ Clustering C:\Users\dell\.julia\packages\Clustering\tt9vc\src\kmeans.jl:103
[5] top-level scope
@ none:1

But how does it complain about StatsBase exports if you’re just doing using Clustering?

I don’t know why?
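One possible reading of the last warning: it names module Clustering rather than Main, so the clash seems to occur inside the package, which presumably loads both StatsBase and Distances itself. The toy modules below reproduce that mechanism. A, B, and Consumer are hypothetical stand-ins, not the real packages.

```julia
module A
export pairwise
pairwise(x) = "A.pairwise"
end

module B
export pairwise
pairwise(x) = "B.pairwise"
end

module Consumer
using ..A, ..B            # both export `pairwise`, as in the warning
broken() = pairwise(1)    # unqualified name is ambiguous inside Consumer
fixed() = A.pairwise(1)   # qualified name is unambiguous
end

ok = Consumer.fixed()

# Calling the unqualified version is expected to fail, not to pick one of the two.
err = try
    Consumer.broken()
    nothing
catch e
    e
end
```

If that is what is happening here, qualifying calls in your own script cannot help, because the unqualified call sits in the package source (kmeans.jl:138). Updating the environment, for example with Pkg.update(), so that mutually compatible versions get resolved would be the thing to try.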