Tuple as argument for size(AbstractArray)

RainerHeintzmann · March 19, 2021, 12:18pm

It would be great to have an alternative way of calling size with a tuple of dimensions.
Is there a specific reason that this does not exist in Base?
The function could look like

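# Extends Base.size; without the Base. prefix this would define a new, unrelated size function.
function Base.size(x::AbstractArray{T}, dim::NTuple{N,Int}; keep_dims=true) where {T,N}
    if !keep_dims
        # return only the requested sizes, in the requested order
        return map(n -> size(x, n), dim)
    end
    # keep all dimensions; those not listed in dim are reported as 1
    sz = ones(Int, ndims(x))
    for n in dim
        sz[n] = size(x, n)
    end
    return Tuple(sz)
end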

It can be used like this:

julia> x=ones(10,11,12);

julia> size(x,(2,3))
(1, 11, 12)

julia> size(x,(3,2), keep_dims=false)
(12, 11)

carstenbauer · March 19, 2021, 1:37pm

FWIW,

julia> x=ones(10,11,12);

julia> size.(Ref(x),(2,3))
(11, 12)

julia> size.(Ref(x),(3,2))
(12, 11)

mbauman · March 19, 2021, 1:41pm

Well, that’s one reason. That works backwards from what I’d expect.

RainerHeintzmann · March 19, 2021, 3:01pm

Not quite sure I understand. The idea was that if keep_dims is true, it returns a size tuple that is not squeezed in any way, i.e. one with the original number of dimensions. If keep_dims is false, however, you intentionally obtain a tuple with the sizes in the order you asked for, and the same dimension may even be repeated. Both of these modes are quite useful when writing array types and the like.

mbauman · March 19, 2021, 3:22pm

The point is that for something to be incorporated into Base, we’ve all gotta agree on how it’s going to work, and ideally it should be unambiguous. I’d expect size(x, (2, 3)) to just be (11, 12). I know you have an optional argument there, but that doesn’t really resolve the tension… it just makes it more complicated.

We’ve added optimized support for size(x)[2:end] for precisely these sorts of use-cases. Also, I’d point you towards working with axes and/or CartesianIndices a bit more.
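A minimal sketch of the building blocks mentioned here, assuming the same x as in the first post; the variable name R is only illustrative:

```julia
x = ones(10, 11, 12)

size(x)[2:end]      # (11, 12): the trailing sizes, returned as a tuple
axes(x, 2)          # Base.OneTo(11): the valid indices along dimension 2
length.(axes(x))    # (10, 11, 12): the sizes recovered from the axes

# CartesianIndices enumerates every index of x, whatever its dimensionality.
R = CartesianIndices(x)
first(R), last(R)   # (CartesianIndex(1, 1, 1), CartesianIndex(10, 11, 12))
```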

Tamas_Papp · March 19, 2021, 4:11pm

Generally, to have something in Base, you need compelling reasons in favor of including it, rather than a lack of reasons against. Also, Julia’s design favors flexible building blocks which you, the programmer, can combine to get what you need.

That said, in addition to broadcasting as suggested by @carstenbauer, you can also just destructure the result:

_, d2, d3 = size(x) # assuming you need d2 and d3 in a calculation

or use indexing:

size(x)[[2,3]]

There are a zillion other possibilities depending on whether 2 and 3 are compile-time constants.
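A few of those possibilities, as a rough sketch; the variable dims is only illustrative, and which variant is preferable likely depends on whether the dimensions are known at compile time:

```julia
x = ones(10, 11, 12)
dims = (2, 3)

# map over a tuple returns a tuple
map(d -> size(x, d), dims)              # (11, 12)

# ntuple with Val fixes the length of the result at compile time
ntuple(i -> size(x, dims[i]), Val(2))   # (11, 12)

# indexing the size tuple with a vector follows the order of the vector
size(x)[[3, 2]]                         # (12, 11)
```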

RainerHeintzmann · March 19, 2021, 6:14pm

Let me try to explain why I think there are compelling reasons for such an addition. I mostly do image processing in multiple dimensions. Each of these dimensions is typically attached to a meaning (e.g. XYZ, time, and a spectral dimension). In some of the languages I have used (Matlab, Python), I am constantly annoyed by operations such as sub-slicing or reductions changing the order of the dimensions. One may want code to run on 2D and 3D data alike, and the outcome of the processing should depend on the meaning of these dimensions and not on whether a dimension happens to be a singleton. In my experience, automatically performed “squeeze” (dropdims()) operations cause a lot of code clutter and trouble when writing code that is supposed to work for 2D and 3D data alike. In NumPy this can largely be avoided with the keepdims=True argument of reduce functions. Julia very nicely has this keepdims=true behaviour as the default for reduce operations (sum, maximum, etc.), yet selectdim() drops singleton dimensions, and for sub-indexing you need to specify a range to keep them:

julia> x=ones(4,5,6);
julia> size(sum(x,dims=2))
(4, 1, 6)
julia> size(x[:,1,:])
(4, 6)
julia> size(x[:,1:1,:])
(4, 1, 6)
julia> size(selectdim(x,2,1))
(4, 6)

Adding a tuple as a possible way of calling size(x, dims) sounds like a natural and useful addition to me. Whether its default should agree with the size you get from selectdim() or with the size you get from reduce operations is, in my view, a matter of taste; my preference is the latter. Another useful addition would be an optional keepdims=false argument for selectdim().
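For comparison, a small sketch of how the singleton dimension can already be kept or dropped with existing functions, assuming selectdim is given a length-one range rather than an integer:

```julia
x = ones(4, 5, 6)

size(selectdim(x, 2, 1))      # (4, 6): an integer index drops the dimension
size(selectdim(x, 2, 1:1))    # (4, 1, 6): a length-one range keeps it as a singleton

# Reductions keep the dimension unless it is dropped explicitly.
size(dropdims(sum(x, dims=2), dims=2))   # (4, 6)
```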

Tamas_Papp · March 21, 2021, 10:19am

Again, I understand that you have a use case for this, but you can just code a function for it if you need it often, or use the solutions above.

Doing transformations in arguments leads to a combinatorial explosion (everyone has a different use case) and is best avoided.

Skoffer · March 21, 2021, 11:04am

Shouldn’t optimizations like that live in some sort of tensor library? It makes sense to have very specialized multidimensional functions, or special versions of functions for multidimensional arrays, in a corresponding package.

RainerHeintzmann · March 22, 2021, 12:17pm

Good point. Maybe instead of introducing the keepdims=true argument, one should use separate functions such as size_d(x, dims::NTuple), which returns the size keeping the singleton dimensions, and selectdims_d(x, dims::NTuple).
Where should they live? Ideally alongside AbstractArray in Base, as I think they may be of high general use, but I understand that this may not necessarily be the general consensus.
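As a rough sketch of what such a helper could look like in user code: size_d here is the hypothetical name from this post, not an existing Base function, and dimensions not listed in dims are reported as 1.

```julia
# Hypothetical helper; not part of Base.
# Returns a tuple of length ndims(x) with 1 for every dimension not in dims.
size_d(x::AbstractArray{T,N}, dims::NTuple{M,Int}) where {T,N,M} =
    ntuple(d -> d in dims ? size(x, d) : 1, Val(N))

x = ones(10, 11, 12)
size_d(x, (2, 3))   # (1, 11, 12)
size_d(x, (3,))     # (1, 1, 12)
```

Defining it under its own name avoids adding methods to Base.size for types one does not own.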

Tamas_Papp · March 22, 2021, 12:31pm

Again, I am wondering if you read @carstenbauer’s suggestion for

size.(Ref(x),(2,3))

It is compact, neat, and requires nothing extra to add to the language.

Here, the secret sauce is broadcasting, which is why in Julia you rarely see “vector” and “scalar” versions for the same function.
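To illustrate what Ref does in this call (a sketch, with A as an illustrative second array): it marks x as a single value for broadcasting, so that only the tuple of dimensions is iterated over.

```julia
x = ones(10, 11, 12)

# Ref(x) is treated as a scalar, so size is called once per entry of the tuple.
size.(Ref(x), (2, 3))   # (11, 12)

# Wrapping x in a one-element tuple has the same effect.
size.((x,), (2, 3))     # (11, 12)

# Without a wrapper, broadcasting also iterates over the elements of the array,
# and the size of a scalar element along any dimension is 1.
A = rand(2, 2)
size.(A, (1,))          # 2×2 matrix of ones
```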

You can also use

@. size($x, (2, 3))

It does not get any better than this: you get the functionality you need from composing existing building blocks.

roflmaostc · March 22, 2021, 12:39pm

If one thinks about it longer, it is indeed nice, but at first impression Ref felt quite strange to me.
But if it is the best method, then it’s probably okay.

However, I believe the more critical part of the above is:

julia> size(x,(2,3))
(1, 11, 12)

which cannot be written as easily as the other one (can it?)

Tamas_Papp · March 22, 2021, 12:55pm

A lot of things do not have a built-in, but can be easily coded. Eg

squash′em(x::NTuple{N}, j) where N =
    map((x, i) -> i ∈ j ? x : 1, x, ntuple(identity, Val(N)))
squash′em(size(x), (2, 3))

carstenbauer · March 22, 2021, 1:01pm

Yeah, there have been discussions about whether we should have a function/struct with a better name, like Scalar or similar, to indicate that something is a scalar under broadcasting. Maybe it’s worth reactivating the discussion?

PS: I didn’t know about interpolation with @.. That’s nice!

tomerarnon · March 22, 2021, 4:41pm

Is this a new feature? Doesn’t work for me on julia 1.5/6

julia> A = rand(2,2);

julia> @. size(A, (1,))
2×2 Array{Int64,2}:
 1  1
 1  1

julia> @. size($A, (1,))
2×2 Array{Int64,2}:
 1  1
 1  1

Tamas_Papp · March 23, 2021, 9:29am

My bad, I forgot that $ escapes function calls, not values.
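A short sketch of what that means in practice, assuming the x from earlier in the thread: a function call prefixed with $ inside @. is left undotted, so the Ref call can be spliced in that way.

```julia
x = ones(10, 11, 12)

# $Ref(x) is kept as a plain call, so this expands to size.(Ref(x), (2, 3)).
@. size($Ref(x), (2, 3))   # (11, 12)

v = [3.0, 1.0, 2.0]
# sort is called once on the whole vector; only sqrt is dotted.
@. sqrt($sort(v))          # same as sqrt.(sort(v))
```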

sijo · March 23, 2021, 10:47am

There was the idea of adding the syntax &A as shorthand for Ref(A) (see #6080 and #34693). I would love to see this implemented.

@roflmaostc maybe size.(&x, (2,3)) would look less strange?

roflmaostc · March 23, 2021, 10:54am

I believe it feels strange because I cannot see why a reference would help in broadcasting. For me these terms are not directly connected.
As @carstenbauer mentioned, Scalar would make much more sense (at least for me)

sijo · March 23, 2021, 11:02am

Agreed… But I thought maybe people would find it easier getting used to &x for opting out of broadcasting since it doesn’t put “reference” in your face quite as much as Ref

DNF · March 23, 2021, 11:16am

I must say I would find that very unnatural, for one, that the length of the returned tuple is different from the input tuple, and secondly, that 1 isn’t a ‘special’ enough value to work as a placeholder.

But I also think that size(x, (2,3)) should be possible, and return the size of those dimensions, it feels more consistent to me. If the preferred solution is size(x)[[2,3]], then size(x)[2] should replace size(x, 2). Messing about with Ref and broadcasting just seems too awkward.

As a data point, Matlab supports size(x, [1, 3]).
