How to find the size of a Set?

A very basic question:

julia> S = Set([1,2,3,4])
Set([4, 2, 3, 1])

julia> size(S)
ERROR: MethodError: no method matching size(::Set{Int64})
Closest candidates are:
  size(::BitArray{1}) at bitarray.jl:77
  size(::BitArray{1}, ::Any) at bitarray.jl:81
  size(::Core.Compiler.StmtRange) at show.jl:1561
  ...
Stacktrace:
 [1] top-level scope at none:0

How can I find the number of elements in a Set?

Answering my own question: for some reason, the number of elements in a Set is called “length”:

julia> length(S)
4

The reason is that Set is a collection.

https://docs.julialang.org/en/v1/base/collections/#Set-Like-Collections-1

I am unsure of the sense in which that constitutes a reason. But it doesn’t matter. The name of the method is slightly non-obvious, but presumably this page will now come up if someone else googles it.

As the documentation for size tells you,

 Return a tuple containing the dimensions of A. 

a set has no dimensions (shape), so size is not the right abstraction. length always gives you the number of elements, and for that reason it applies to Set.
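To make the distinction concrete, here is a small sketch (the variable names are only illustrative) comparing the two functions on an array, which has a shape, and on a Set, which does not:

```julia
A = zeros(3, 5)          # a 3×5 matrix
S = Set([1, 2, 3, 4])

size(A)      # (3, 5): one entry per dimension
length(A)    # 15: total number of elements
length(S)    # 4: a Set has elements but no dimensions
```

For an array, length(A) equals prod(size(A)); for a Set there are no dimensions to multiply, only the count.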

Yes, but logically the term “length” doesn’t make sense for a set, because, well, sets aren’t the sort of things that have length. So the name “length” is not so easy to guess if you don’t already know it. But I did guess it all the same, and posted my answer, so that’s the end of the problem really. Nothing more to see here.

As @pkofod pointed out, length is for the number of elements in a collection. Set is a collection, so it has length.

Whether it is intuitive or not is of course subjective, but perhaps as you learn Julia you will find it reasonable. One advantage is that once you learn about length (which comes much earlier in the docs), you can apply it to all sorts of collections; it is in this sense that length is said to be generic.
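As a quick illustration of what "generic" means here, the same call works across several unrelated Base collection types. The particular collections below are chosen arbitrarily:

```julia
# A tuple of different kinds of collections, all from Base
collections = ([1, 2, 3], Set([1, 2, 3]), Dict(:a => 1, :b => 2), (1, 2), "abc", 1:10)

# `length` is defined for every one of them
lengths = map(length, collections)   # (3, 3, 2, 2, 3, 10)
```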

Generic building blocks are key to using Julia efficiently: you can write code that works for a wide range of collections, whether or not they are Sets. For example:

julia> function dont_like_4!(x)
           4 ∈ x && empty!(x)
           x
       end
dont_like_4! (generic function with 1 method)

julia> dont_like_4!([1,2,3])
3-element Array{Int64,1}:
 1
 2
 3

julia> dont_like_4!(Set([1,2,3]))
Set([2, 3, 1])

julia> dont_like_4!([1,2,3,4])
0-element Array{Int64,1}

julia> dont_like_4!(Set([1,2,3,4]))
Set(Int64[])

Finally, I am curious if you have an alternative suggestion that would be more easily discoverable.

FWIW, you could make a PR that adds a size(s::Set) method which throws a more helpful error pointing users to length. I don’t know whether it would be accepted, though.
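A rough sketch of what such a method might look like, with the error type and wording being my own assumptions rather than anything from Base. Note that defining this outside of Base is type piracy, so it is only meant to show the idea, not something to put in a package:

```julia
# Sketch only: extending a Base function for a Base type outside of Base
# is type piracy. The ArgumentError and its message are assumptions.
function Base.size(s::Set)
    throw(ArgumentError(
        "size is not defined for $(typeof(s)) because a Set has no dimensions; " *
        "use length(s) to get the number of elements"))
end

# Capture the error instead of letting it propagate, to inspect it
err = try
    size(Set([1, 2, 3]))
    nothing
catch e
    e
end
```

With this in place, calling size on a Set raises an ArgumentError that mentions length instead of a MethodError listing unrelated candidates.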

No, no, look, you’re all dragging this out way longer than it needs to be. I understand perfectly well that there’s an internal consistency to the naming of methods in any language, and that in this case, the logical choice was ‘length’ and not ‘size’. The only problem was that ‘length’ was hard to guess, and the documentation was hard to find. (Try googling “julia set”.)

These problems are solved by the existence of the first two posts in this thread, which will hopefully appear when someone else googles the same question. There is really no need for any further discussion at all.

Matlab does some strange things with the length function; for example, for an array, A, it is equal to max(size(A)).

But they have a numel function which is basically the same as Julia’s length. I’m not sure how discoverable it would be, except for former Matlab users, but it actually makes more sense to me than length.

I suppose it would go well along with eltype.
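For anyone coming from Matlab, here is a small sketch of the comparison. numel is not a Julia function; it is defined below purely as a hypothetical alias to mirror the Matlab name:

```julia
# Hypothetical alias: `numel` does not exist in Julia's Base
numel(x) = length(x)

A = zeros(3, 5)

n_elements  = numel(A)            # 15, the same as prod(size(A))
largest_dim = maximum(size(A))    # 5, what Matlab's length(A) would report
n_in_set    = numel(Set([1, 2, 3, 4]))   # 4
```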