It would be great to have an alternative way of calling size with a tuple of dimensions.
Is there a specific reason that this does not exist in Base?
The function could look like
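function Base.size(x::AbstractArray{T}, dim::NTuple{N,Int}; keep_dims=true) where {T,N}
    # Note: adding methods to Base.size from user code is type piracy; shown for illustration only.
    if !keep_dims
        # Return the requested sizes in the order asked for (repetitions allowed)
        return map(n -> size(x, n), dim)
    end
    # Keep the original number of dimensions; dimensions not requested get size 1
    sz = ones(Int, ndims(x))
    for n in dim
        sz[n] = size(x, n)
    end
    return Tuple(sz)
end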
Not quite sure I understand. The idea was that if keep_dims is true, it returns a size tuple that is not squeezed in any way, i.e. with the original number of dimensions. If keep_dims is false, you intentionally get a tuple with the sizes in the order you asked for, with repetitions of the same dimension allowed. Both of these modes are quite useful when writing array types and the like.
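To make the two modes concrete, here is a minimal sketch. It uses a hypothetical helper called dimsize (the name and the keep_dims keyword are assumptions for illustration, not anything in Base), so that Base.size itself is left untouched:

```julia
# Hypothetical helper, not part of Base: sizes of the requested dimensions.
function dimsize(x::AbstractArray, dims::NTuple{N,Int}; keep_dims::Bool=true) where {N}
    # keep_dims=false: sizes in the order requested, repetitions allowed
    keep_dims || return map(d -> size(x, d), dims)
    # keep_dims=true: one entry per dimension of x, 1 for dimensions not requested
    return ntuple(d -> d in dims ? size(x, d) : 1, ndims(x))
end

x = zeros(10, 11, 12)
dimsize(x, (2, 3))                      # (1, 11, 12)
dimsize(x, (2, 3); keep_dims=false)     # (11, 12)
dimsize(x, (3, 3, 2); keep_dims=false)  # (12, 12, 11)
```

Note that the keep_dims=true result has ndims(x) entries regardless of how many dimensions were requested, which is exactly the point of contention in this thread.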
The point is that for something to be incorporated into Base, we've all gotta agree on how it's going to work, and ideally it should be unambiguous. I'd expect size(x, (2, 3)) to just be (11, 12). I know you have an optional argument there, but that doesn't really resolve the tension... it just makes it more complicated.
We've added optimized support for size(x)[2:end] for precisely these sorts of use-cases. Also, I'd point you towards working with axes and/or CartesianIndices a bit more.
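As a rough illustration of the building blocks mentioned here, the following sketch assumes a 10×11×12 array (the shape is inferred from the (11, 12) result quoted above) and uses only standard Base functions:

```julia
x = zeros(10, 11, 12)

# Indexing the size tuple with a range gives a tuple of the trailing sizes
trailing = size(x)[2:end]           # (11, 12)

# axes returns the index ranges rather than the lengths
ax2, ax3 = axes(x, 2), axes(x, 3)

# CartesianIndices over the selected axes iterates just those dimensions
R = CartesianIndices((ax2, ax3))
size(R)                             # (11, 12)
last(R)                             # CartesianIndex(11, 12)
```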
Generally, to have something in Base, you need compelling reasons in favor of including it, rather than a lack of reasons against. Also, Julia's design favors flexible building blocks which you, the programmer, can combine to get what you need.
That said, in addition to broadcasting as suggested by @carstenbauer, you can also just destructure the result:
_, d2, d3 = size(x) # assuming you need d2 and d3 in a calculation
or use indexing:
size(x)[[2,3]]
There are a zillion other possibilities depending on whether 2 and 3 are compile-time constants.
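For comparison, here are a few of these possibilities side by side. This is only a sketch on an assumed 10×11×12 array; which variant is preferable depends on the surrounding code:

```julia
x = zeros(10, 11, 12)

# Destructuring, ignoring the first dimension
_, d2, d3 = size(x)                       # d2 == 11, d3 == 12

# Indexing the size tuple with a vector of dimensions
by_index = size(x)[[2, 3]]                # contains 11 and 12

# Mapping over a literal tuple of dimensions
by_map = map(d -> size(x, d), (2, 3))     # (11, 12)
```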
Let me try to explain why I think there are compelling reasons for such an addition. I mostly do image processing in multiple dimensions, where each dimension typically has a meaning attached to it (e.g. XYZ, time, and a spectral dimension). In some languages I have used (Matlab, Python), I am constantly annoyed when operations such as sub-slicing or reductions change the order of the dimensions. One may want the same code to run on 2D and 3D data, and the outcome of the processing should depend on the meaning of the dimensions, not on whether a dimension happens to be a singleton. In my experience, automatic "squeeze" (dropdims()) operations cause a lot of code clutter and trouble when writing code that is supposed to work for 2D and 3D data alike. In NumPy this can largely be avoided with the keepdims=True argument of the reduce functions. Julia very nicely has this keepdims=true behaviour as the default for reduce operations (sum, maximum, etc.), yet selectdim() drops singletons, and for sub-indexing you need to specify a range to keep singleton dimensions.
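The difference between these cases can be seen in a short sketch (the 10×11×12 shape is just an assumed example):

```julia
x = rand(10, 11, 12)

s_reduce = size(sum(x; dims=3))       # (10, 11, 1): the reduced dimension stays as a singleton
s_select = size(selectdim(x, 3, 1))   # (10, 11): selectdim with a scalar index drops it
s_scalar = size(x[:, :, 1])           # (10, 11): scalar indexing drops it as well
s_range  = size(x[:, :, 1:1])         # (10, 11, 1): a length-1 range keeps it

# An explicit squeeze is needed to drop the singleton after a reduction
s_drop = size(dropdims(sum(x; dims=3); dims=3))   # (10, 11)
```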
Adding a tuple as a possible way of calling size(x, dims) sounds like a natural and useful addition to me. Whether the default should agree with the size you get from selectdim() or from reduce operations is, in my view, a matter of taste; my preference is the latter. Another useful addition would be an optional keepdims argument (defaulting to false) for selectdim().
Shouldn't optimizations like that live in some sort of tensor library? It makes sense to have some very special multidimensional functions, or special versions of functions for multidimensional arrays, in a corresponding package.
Good point. Maybe instead of introducing the keepdims=true argument, one should use separate functions such as size_d(x, dims::NTuple), which returns the size keeping the singleton dimensions, and selectdims_d(x, dims::NTuple).
Where should it live? I think ideally in AbstractArray, as I think it may be of high general use, but I understand that this may not necessarily be the general consensus.
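For what it's worth, a singleton-keeping selectdim can already be approximated in user code by passing a length-1 range as the index. The sketch below uses a hypothetical name, selectdim_keep, and handles only a single dimension, so it is not the selectdims_d proposed above:

```julia
# Hypothetical helper (not a Base function): like selectdim, but the selected
# dimension d is kept as a singleton by indexing with the range i:i.
selectdim_keep(x::AbstractArray, d::Integer, i::Integer) = selectdim(x, d, i:i)

x = rand(10, 11, 12)
size(selectdim(x, 2, 5))        # (10, 12)
size(selectdim_keep(x, 2, 5))   # (10, 1, 12)
```

Both calls return views into x, so no data is copied in either case.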
If one thinks about it longer, it is indeed nice, but at first impression Ref felt quite strange to me.
But if it is the best method, then it's probably okay.
However, I believe the more critical case from the part above is:
julia> size(x,(2,3))
(1, 11, 12)
which cannot be written as easily as the other one (can it?)
Yeah, there have been discussions about whether we should have a function/struct with a better name, like Scalar or similar, to indicate that something is a scalar under broadcasting. Maybe it's worth reactivating the discussion?
PS: I didn't know about interpolation with the @. macro. That's nice!
I believe it feels strange because I cannot see why a reference would help in broadcasting. For me, these terms are not directly connected.
As @carstenbauer mentioned, Scalar would make much more sense (at least for me)
Agreed... But I thought maybe people would find it easier getting used to &x for opting out of broadcasting since it doesn't put "reference" in your face quite as much as Ref
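For readers unfamiliar with the idiom under discussion, here is a small sketch of what currently works for treating x as a scalar under broadcasting. Scalar and &x are only proposals mentioned in this thread and are not shown; the 10×11×12 shape is an assumption:

```julia
x = zeros(10, 11, 12)

# Ref(x) wraps x so that broadcasting treats it as a single value
a = size.(Ref(x), (2, 3))          # (11, 12)

# A 1-tuple has the same effect, since length-1 containers are expanded
b = size.((x,), (2, 3))            # (11, 12)

# Inside @., a $ in front of a call keeps that call from being dotted
c = @. size($Ref(x), (2, 3))       # (11, 12)
```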
I must say I would find that very unnatural: for one, the length of the returned tuple is different from the input tuple, and secondly, 1 isn't a "special" enough value to work as a placeholder.
But I also think that size(x, (2,3)) should be possible, and return the size of those dimensions, it feels more consistent to me. If the preferred solution is size(x)[[2,3]], then size(x)[2] should replace size(x, 2). Messing about with Ref and broadcasting just seems too awkward.