Tuple as argument for size(AbstractArray)

It would be great to have an alternative way of calling size with a tuple of dimensions.
Is there a specific reason that this does not exist in Base?
The function could look like

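# Extends Base.size so that the existing size(x, n) methods stay reachable
function Base.size(x::AbstractArray, dim::NTuple{N,Int}; keep_dims=true) where {N}
    # keep_dims=false: return the requested sizes in the order given
    if !keep_dims
        return map(n -> size(x, n), dim)
    end
    # keep_dims=true: dimensions that were not requested are reported as 1
    sz = ones(Int, ndims(x))
    for n in dim
        sz[n] = size(x, n)
    end
    return Tuple(sz)
end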

It can be used like this:

julia> x=ones(10,11,12);

julia> size(x,(2,3))
(1, 11, 12)

julia> size(x,(3,2), keep_dims=false)
(12, 11)

FWIW,

julia> x=ones(10,11,12);

julia> size.(Ref(x),(2,3))
(11, 12)

julia> size.(Ref(x),(3,2))
(12, 11)

Well, that’s one reason. That works backwards from what I’d expect. :slight_smile:


Not quite sure I understand. The idea was that if keep_dims is true, it returns a size tuple that is not squeezed in any way, i.e. one with the original number of dimensions. If keep_dims is false, you intentionally obtain a tuple with the sizes in the order you asked for, and the same dimension may even be repeated. Both modes are quite useful when writing array types and the like.

The point is that for something to be incorporated into Base, we’ve all gotta agree on how it’s going to work, and ideally it should be unambiguous. I’d expect size(x, (2, 3)) to just be (11, 12). I know you have an optional argument there, but that doesn’t really resolve the tension… it just makes it more complicated.

We’ve added optimized support for size(x)[2:end] for precisely these sorts of use-cases. Also, I’d point you towards working with axes and/or CartesianIndices a bit more.
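
As a minimal sketch of those suggestions (x is just the example array from earlier in the thread), slicing the size tuple, axes, and CartesianIndices can be combined roughly like this:

```julia
x = ones(10, 11, 12)

# Slice the size tuple to drop the leading dimension
trailing = size(x)[2:end]            # (11, 12)

# axes returns one index range per dimension
ax2, ax3 = axes(x, 2), axes(x, 3)

# CartesianIndices built from a subset of the axes covers just those dimensions
inds = CartesianIndices((ax2, ax3))
size(inds)                           # (11, 12)
```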


Generally, to have something in Base, you need compelling reasons in favor of including it, not just a lack of reasons against. Also, Julia’s design favors flexible building blocks which you, the programmer, can combine to get what you need.

That said, in addition to broadcasting as suggested by @carstenbauer, you can also just destructure the result:

_, d2, d3 = size(x) # assuming you need d2 and d3 in a calculation

or use indexing:

size(x)[[2,3]]

There are a zillion other possibilities depending on whether 2 and 3 are compile-time constants.
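
For instance, one more of those possibilities, sketched for the case where the dimensions are given as a tuple. The helper name sizes_at is made up for this example and is not part of Base:

```julia
x = ones(10, 11, 12)

# map over a tuple returns a tuple, in the order requested
map(d -> size(x, d), (2, 3))         # (11, 12)

# Hypothetical helper (not in Base): sizes of the requested dimensions,
# in the order given; repeated dimensions are allowed.
sizes_at(x::AbstractArray, dims::Tuple{Vararg{Int}}) = map(d -> size(x, d), dims)

sizes_at(x, (3, 2))                  # (12, 11)
sizes_at(x, (2, 2))                  # (11, 11)
```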


Let me try to explain why I think there are compelling reasons for such an addition. I mostly do image processing in multiple dimensions, where each dimension typically has a meaning attached (e.g. XYZ, time, and a spectral dimension). In other languages I have used (Matlab, Python), I am constantly annoyed by operations such as sub-slicing or reductions changing the order of dimensions. One may want code to run on 2D and 3D data alike, and the outcome of the processing should depend on the meaning of these dimensions, not on whether a dimension happens to be a singleton. In my experience, automatically performing “squeeze” (dropdims()) operations causes a lot of code clutter and trouble when writing code that is supposed to work for 2D and 3D data alike. In NumPy this can largely be avoided with the keepdims=True argument of reduce functions. Julia very nicely has this keepdims behaviour as the default for reduce operations (sum, maximum, etc.), yet selectdim() drops singletons, and for sub-indexing you need to specify a range to keep singleton dimensions:

julia> x=ones(4,5,6);
julia> size(sum(x,dims=2))
(4, 1, 6)
julia> size(x[:,1,:])
(4, 6)
julia> size(x[:,1:1,:])
(4, 1, 6)
julia> size(selectdim(x,2,1))
(4, 6)

Adding a tuple as a possible way of calling size(x, dims) sounds like a natural and useful addition to me. Whether the default should agree with the size you get from selectdim() or from reduce operations is, in my view, a matter of taste; my preference is the latter. Another useful addition would be to add keepdims=false as an optional argument to selectdim().
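
A minimal sketch of what a singleton-keeping selectdim could look like with existing tools, assuming a single integer index. The name selectdim_keep is hypothetical and not part of Base:

```julia
x = ones(4, 5, 6)

# Hypothetical helper (not in Base): like selectdim, but the selected
# dimension is kept as a singleton by passing the range i:i instead of i.
selectdim_keep(A::AbstractArray, d::Integer, i::Integer) = selectdim(A, d, i:i)

size(selectdim(x, 2, 1))             # (4, 6)
size(selectdim_keep(x, 2, 1))        # (4, 1, 6)
```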

Again, I understand that you have a use case for this, but you can just code a function for it if you need it often, or use the solutions above.

Doing transformations in arguments leads to a combinatorial explosion (everyone has a different use case) and is best avoided.


Shouldn’t optimizations like that live in some sort of tensor library? It makes sense to have very special multidimensional functions, or special versions of functions for multidimensional arrays, in a corresponding package.

Good point. Maybe instead of introducing the keepdims=true argument, one should use separate functions such as size_d(x, dims::NTuple), which returns the size keeping the singleton dimensions, and selectdims_d(x, dims::NTuple).
Where should it live? I think ideally in AbstractArray, as I think it may be of high general use, but I understand that this may not necessarily be the general consensus.

Again, I am wondering if you read @carstenbauer’s suggestion for

size.(Ref(x),(2,3))

It is compact, neat, and requires nothing extra to add to the language.

Here, the secret sauce is broadcasting, which is why in Julia you rarely see “vector” and “scalar” versions for the same function.
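
To make the role of Ref concrete, here is a small sketch: broadcasting iterates over every container argument, and Ref (or a one-element tuple) wraps x so that it is treated as a single value while the tuple of dimensions is iterated.

```julia
x = ones(10, 11, 12)

# Ref(x) is treated as a scalar, so only the tuple (2, 3) is broadcast over
size.(Ref(x), (2, 3))                # (11, 12)

# Wrapping x in a one-element tuple has the same effect
size.((x,), (2, 3))                  # (11, 12)

# Without a wrapper, broadcasting iterates over the elements of the array,
# so each entry is size(1.0, 1)
size.(ones(2), (1, 1))               # [1, 1]
```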

You can also use

@. size($x, (2, 3))

It does not get any better than this: you get the functionality you need from composing existing building blocks.


If one thinks about it longer, it is indeed nice, but at first impression, Ref felt quite strange to me.
But if it is the best method, then it’s probably okay.

However, I believe the more critical part of the above is:

julia> size(x,(2,3))
(1, 11, 12)

which cannot be written as easily as the other one (can it?)


A lot of things do not have a built-in, but can be easily coded. E.g.

squash′em(x::NTuple{N}, j) where N =
    map((x, i) -> i ∈ j ? x : 1, x, ntuple(identity, Val(N)))
squash′em(size(x), (2, 3))

Yeah, there has been discussion about whether we should have a function/struct with a better name, like Scalar or similar, to indicate that something is a scalar under broadcasting. Maybe it’s worth reactivating the discussion?

PS: I didn’t know about interpolation with @.. That’s nice!


Is this a new feature? It doesn’t work for me on Julia 1.5/1.6:

julia> A = rand(2,2);

julia> @. size(A, (1,))
2×2 Array{Int64,2}:
 1  1
 1  1

julia> @. size($A, (1,))
2×2 Array{Int64,2}:
 1  1
 1  1


My bad, I forgot that $ escapes function calls, not values.
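
For reference, a short sketch of what $ does inside @.: it exempts a function call from being dotted, so splicing in the Ref call (rather than the variable) should give the intended expression.

```julia
x = ones(10, 11, 12)

# $Ref(x) keeps the Ref call undotted, while the surrounding size call is
# still dotted, so this expands to size.(Ref(x), (2, 3))
result = @. size($Ref(x), (2, 3))    # (11, 12)
```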

There was the idea of adding the syntax &A as shorthand for Ref(A) (see #6080 and #34693). I would love to see this implemented.

@roflmaostc maybe size.(&x, (2,3)) would look less strange?

I believe it feels strange because I cannot see why a reference would help in broadcasting. For me these terms are not directly connected.
As @carstenbauer mentioned, Scalar would make much more sense (at least for me)


Agreed… But I thought maybe people would find it easier getting used to &x for opting out of broadcasting since it doesn’t put “reference” in your face quite as much as Ref :slight_smile:


I must say I would find that very unnatural: for one, the length of the returned tuple differs from that of the input tuple, and secondly, 1 isn’t a ‘special’ enough value to work as a placeholder.

But I also think that size(x, (2,3)) should be possible, and return the size of those dimensions, it feels more consistent to me. If the preferred solution is size(x)[[2,3]], then size(x)[2] should replace size(x, 2). Messing about with Ref and broadcasting just seems too awkward.

As a data point, Matlab supports size(x, [1, 3]).