Broken tests in fresh install of Julia with StatsBase and Clustering packages.

#1

I’m brand new to Julia. I recently installed Julia 1.0.3 fresh on two separate macOS 10.14.1 machines (one via the Mac app bundle, the other via Homebrew), and everything seems fine when going through basic tutorials. On both machines, however, the packages StatsBase (v0.27.0) and Clustering (v0.12.1) report some broken tests when I run pkg> test XXX (with XXX replaced by StatsBase or Clustering). I tried removing and re-adding the packages, and also updating them, but nothing changed.

I’ve raised these issues on the respective package GitHub pages (both are part of JuliaStats), but was advised that there is no problem with the packages on their end and that I should bring the discussion here, since the issues may reflect a problem in a more fundamental component of Julia, whatever that might mean. Although a few broken tests may not seem like a huge deal, as a new user I’m highly concerned about migrating to Julia when what I consider fundamental stats packages work for the package maintainer but throw errors on my brand new install. Any help or advice would be greatly appreciated. The errors are:

StatsBase: 1 broken test reported for Ambiguities
Clustering: 1 broken test reported for each of silhouettes, MCL and V-measure


#2

Ref https://github.com/JuliaStats/StatsBase.jl/issues/441 and https://github.com/JuliaStats/Clustering.jl/issues/135.

Could you copy paste the entire stack trace for the failing tests?


#3

(v1.0) pkg> test StatsBase
   Testing StatsBase
 Resolving package versions...
    Status `/var/folders/w7/3dhp9vkd72s8phhgsbjcbn680000gn/T/tmpnV5HJF/Manifest.toml`
  [7d9fca2a] Arpack v0.3.0
  [9e28174c] BinDeps v0.8.10
  [b99e7846] BinaryProvider v0.5.3
  [324d7699] CategoricalArrays v0.5.2
  [944b1d66] CodecZlib v0.5.1
  [34da2185] Compat v1.4.0
  [a93c6f00] DataFrames v0.16.0
  [9a8bc11e] DataStreams v0.4.1
  [864edb3b] DataStructures v0.14.0
  [31c24e10] Distributions v0.16.4
  [38e38edf] GLM v1.0.2
  [82899510] IteratorInterfaceExtensions v0.1.1
  [e1d29d7a] Missings v0.3.1
  [bac558e1] OrderedCollections v1.0.2
  [90014a1f] PDMats v0.9.6
  [1fd47b50] QuadGK v2.0.3
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v0.5.2
  [79098fc4] Rmath v0.5.0
  [a2af1166] SortingAlgorithms v0.3.1
  [276daf66] SpecialFunctions v0.7.2
  [2913bbd2] StatsBase v0.27.0
  [4c63d2b9] StatsFuns v0.7.0
  [3eaba693] StatsModels v0.3.1
  [3783bdb8] TableTraits v0.4.1
  [bd369af6] Tables v0.1.14
  [3bb67fe8] TranscodingStreams v0.8.1
  [30578b45] URIParser v0.4.0
  [ea10d353] WeakRefStrings v0.5.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8bb1440f] DelimitedFiles  [`@stdlib/DelimitedFiles`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [9fa8497b] Future  [`@stdlib/Future`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [a63ad114] Mmap  [`@stdlib/Mmap`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [1a1011a3] SharedArrays  [`@stdlib/SharedArrays`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [4607b0f0] SuiteSparse  [`@stdlib/SuiteSparse`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
Running tests:
 * ambiguous.jl ...
Skipping StatsBase.findat
Skipping StatsBase.hist
Skipping StatsBase.wmean!
Skipping Base.active_repl
Skipping Base.active_repl_backend
Test Summary: | Broken  Total
Ambiguities   |      1      1
 * weights.jl ...
Test Summary:     |  Pass  Total
StatsBase.Weights | 10946  10946
 * moments.jl ...
Test Summary:     | Pass  Total
StatsBase.Moments |  356    356
 * scalarstats.jl ...
 * deviation.jl ...
 * cov.jl ...
Test Summary:        | Pass  Total
StatsBase.Covariance |  308    308
 * counts.jl ...
Test Summary: | Pass  Total
views         |    1      1
 * ranking.jl ...
 * empirical.jl ...
Test Summary: | Pass  Total
ECDF          |    6      6
 * hist.jl ...
Test Summary:       | Pass  Total
StatsBase.Histogram |  103    103
 * rankcorr.jl ...
 * signalcorr.jl ...
 * misc.jl ...
 * robust.jl ...
 * sampling.jl ...
Test Summary:  | Pass  Total
sampling pairs |    4      4
 * wsampling.jl ...
 * statmodels.jl ...
 * partialcor.jl ...
   Testing StatsBase tests passed


(v1.0) pkg> test Clustering
   Testing Clustering
 Resolving package versions...
    Status `/var/folders/w7/3dhp9vkd72s8phhgsbjcbn680000gn/T/tmpF0dzbE/Manifest.toml`
  [b99e7846] BinaryProvider v0.5.3
  [aaaa29a8] Clustering v0.12.1
  [944b1d66] CodecZlib v0.5.1
  [864edb3b] DataStructures v0.14.0
  [b4f34e82] Distances v0.7.4
  [e1d29d7a] Missings v0.3.1
  [b8a86587] NearestNeighbors v0.4.2
  [bac558e1] OrderedCollections v1.0.2
  [a2af1166] SortingAlgorithms v0.3.1
  [90137ffa] StaticArrays v0.10.2
  [2913bbd2] StatsBase v0.27.0
  [3bb67fe8] TranscodingStreams v0.8.1
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8bb1440f] DelimitedFiles  [`@stdlib/DelimitedFiles`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [a63ad114] Mmap  [`@stdlib/Mmap`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
Runing tests:
* seeding.jl ...
Test Summary: | Pass  Total
seeding       |   23     23
* kmeans.jl ...
Test Summary:      | Pass  Total
kmeans() (k-means) |   44     44
* kmedoids.jl ...
Test Summary:          | Pass  Total
kmedoids() (k-medoids) |   15     15
* affprop.jl ...
Test Summary:                         | Pass  Total
affinityprop() (affinity propagation) |   68     68
* dbscan.jl ...
Test Summary:                | Pass  Total
dbscan() (DBSCAN clustering) |   14     14
* fuzzycmeans.jl ...
Test Summary:  | Pass  Total
fuzzy_cmeans() |   15     15
* silhouette.jl ...
Test Summary: | Pass  Broken  Total
silhouettes() |    8       1      9
* varinfo.jl ...
Test Summary:                       | Pass  Total
varinfo() (variational information) |    9      9
* randindex.jl ...
Test Summary:            | Pass  Total
randindex() (Rand index) |   13     13
* hclust.jl ...
Test Summary:                      | Pass  Total
hclust() (hierarchical clustering) | 6650   6650
* mcl.jl ...
Test Summary: | Pass  Broken  Total
MCL           |   29       1     30
* vmeasure.jl ...
Test Summary: | Pass  Broken  Total
V-measure     |    8       1      9
   Testing Clustering tests passed

#4

Oh, I see. The testset is passing (that’s why it says “Testing Clustering tests passed” at the end).

You’re seeing the “Broken” tests. Those are specifically tests which have been marked as @test_broken. That is, they are known to not pass, and are just included in the testset by the developers as a reminder for future work. All software has issues, missing features, or edge cases, so the fact that there are some issues which the authors are aware of really doesn’t indicate anything being wrong with the package.
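
To make this concrete, here is a minimal sketch of how a developer might mark a known issue. The function `buggy_abs` and the testset name are made up for illustration and do not come from StatsBase or Clustering; only `@testset`, `@test` and `@test_broken` are real macros from the Test standard library.

```julia
using Test

# Hypothetical function with a known, not-yet-fixed problem:
# it returns the wrong answer for negative inputs.
buggy_abs(x) = x

results = @testset "demo (hypothetical)" begin
    @test buggy_abs(2) == 2           # works today, counted as Pass
    @test_broken buggy_abs(-2) == 2   # known to fail, counted as Broken
end
```

Running this prints a summary with one Pass and one Broken and does not throw, which is the same situation as in the logs above. If the `@test_broken` expression ever starts passing, Test reports that as an error (an unexpected pass), so the developer is reminded to switch it back to a plain `@test`.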

If you’re really interested, you could look for instances of @test_broken in the codebase and learn about what the known issues are. But the presence of known issues shouldn’t need to affect your confidence in the package.
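
If you want to do that search from Julia itself, something along these lines should work. This is only a sketch: the helper name `find_broken_tests` is my own, and the demo directory and file are created just so the example runs on its own.

```julia
# Collect (file, line number, line text) for every line under `dir`
# in a .jl file that mentions @test_broken.
function find_broken_tests(dir::AbstractString)
    hits = Tuple{String,Int,String}[]
    for (root, _, files) in walkdir(dir)
        for file in files
            endswith(file, ".jl") || continue
            path = joinpath(root, file)
            for (n, line) in enumerate(eachline(path))
                if occursin("@test_broken", line)
                    push!(hits, (path, n, String(strip(line))))
                end
            end
        end
    end
    return hits
end

# Self-contained demo on a throwaway directory (hypothetical file contents).
demo_dir = mktempdir()
write(joinpath(demo_dir, "demo.jl"), "@test 1 == 1\n@test_broken 1 == 2\n")
hits = find_broken_tests(demo_dir)
foreach(println, hits)
```

For an installed package, assuming the usual layout with a `test` folder next to `src`, you could point it at `joinpath(dirname(dirname(pathof(Clustering))), "test")` after `using Clustering`.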


#5

I see. That’s good to hear. As a new user I’d recommend modifying the test reports to help avoid confusion. I did notice the “tests passed” summary at the bottom, but that didn’t make sense to me given the broken tests reported prior. I suspect that some sort of clarification, however brief, about what the broken tests mean in the summary would be really helpful for most users.