The verbosity of dropdims…dims…dims (was: range improvement for 1.1)


#1

It would be nice if this could be something like sum(a, dims=1, drop=true) for all functions which take dims.
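For context, here is the spelling this thread complains about, using only Base functions. The array `a` is just an example value chosen for illustration.

```julia
# Example data: a 2×3 matrix with columns (1,2), (3,4), (5,6).
a = reshape(collect(1.0:6.0), 2, 3)

# A reduction with `dims` keeps the reduced dimension as a singleton.
s = sum(a, dims=1)                     # size (1, 3)

# To get a plain vector, `dims` has to be written a second time.
v = dropdims(sum(a, dims=1), dims=1)   # size (3,)
```

The proposal is to avoid repeating `dims` in the second call.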


#2

My understanding is that this would not be type stable.


#3

I get that a generic squeeze function can’t be, but why would this be worse than dropdims itself?

Edit: Ah I see now, sorry. A keyword drop=(2,3) vs drop=nothing etc could let the type system know, but the value drop=true can’t. Unless all this new fancy constant propagation business can save us…
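A rough sketch of that distinction, assuming a hypothetical `sumdrop_typed` and helper `_maybe_drop` (neither exists in Base). Julia picks methods by argument type, so `nothing` and a tuple can select different methods, whereas `true` and `false` share the type `Bool`.

```julia
# Hypothetical helper: the *type* of the second argument selects the method.
_maybe_drop(s, ::Nothing) = s                      # keep singleton dimensions
_maybe_drop(s, dims) = dropdims(s, dims=dims)      # drop the listed dimensions

# Hypothetical wrapper: `drop` is either `nothing` or the dimensions to drop.
sumdrop_typed(a; dims, drop=nothing) = _maybe_drop(sum(a, dims=dims), drop)

a = ones(2, 3)
kept    = sumdrop_typed(a, dims=2)               # 2×1 matrix
dropped = sumdrop_typed(a, dims=2, drop=(2,))    # 2-element vector
```

Whether inference actually succeeds here can be checked with `@code_warntype` on the Julia version at hand.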


#4

Perhaps you missed https://github.com/JuliaLang/julia/pull/28708

I know, and I said before it is indeed better, but still not as nice as linspace was. LinRange is a good substitution though.

It would be nice if this could be something like sum(a, dims=1, drop=true) for all functions which take dims.

That would not be type-stable if understood as “drop all singleton dimensions”, but if it dropped only the dimensions specified by dims, I think constant propagation would make that fine?

mbauman suggested sum(a, squeeze=1) (which would now perhaps be spelled sum(a, dropdims=1)) in https://github.com/JuliaLang/julia/issues/16606, which is a nice idea (although perhaps not very intuitive at first). StefanKarpinski suggested dropdims(sum, a, dims=1), which is nice and composable, but it does lose something in translation (the main action is that I want to sum that array, not that I want to drop some dimensions, and therefore sum should be the first thing appearing in the expression).
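The higher-order form can be sketched in a few lines. The name `dropdims_after` is made up here so that nothing in Base is overloaded; the actual proposal was a method of dropdims itself.

```julia
# Hypothetical name: apply a `dims`-aware reduction, then drop those dimensions.
dropdims_after(f, A; dims) = dropdims(f(A, dims=dims), dims=dims)

A = [1 2; 3 4]
col_sums = dropdims_after(sum, A, dims=1)       # sums down each column
row_max  = dropdims_after(maximum, A, dims=2)   # maximum of each row
```

This works for any function that accepts a `dims` keyword and keeps the reduced dimensions as singletons.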


#5

FWIW I would go with something like

sum(A, Drop(2, 3))
sum(A, Squeeze(2, 3))

where Drop and Squeeze wrap a tuple with a convenience splat syntax.
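A minimal sketch of what such a wrapper might look like. The `Drop` type and the extra `sum` method below are assumptions for illustration only; neither exists in Base.

```julia
# Hypothetical wrapper type holding the dimensions to reduce over and drop.
struct Drop{N}
    dims::NTuple{N,Int}
end

# Convenience splat constructor, so that Drop(2, 3) == Drop((2, 3)).
Drop(dims::Int...) = Drop(dims)

# Assumed method: reduce over d.dims, then drop those singleton dimensions.
Base.sum(A::AbstractArray, d::Drop) = dropdims(sum(A, dims=d.dims), dims=d.dims)

A = ones(2, 3, 4)
v = sum(A, Drop(2, 3))   # vector of length 2
m = sum(A, Drop(1))      # 3×4 matrix
```

Because the dimensions travel inside a dedicated type, the choice between "keep" and "drop" is made by dispatch rather than by a runtime Bool.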


#6

function sumdrop(a; dims, drop=false)
    s = sum(a,dims=dims)
    if drop
        return dropdims(s, dims=dims)
    end
    return s
end

function testsum(a)
    sumdrop(a,dims=2, drop=true)
end

@code_warntype testsum(randn(2,2))

Body::Union{Array{Float64,1}, Array{Float64,2}}
11 1 ── %1  = Main.sumdrop::Core.Compiler.Const(sumdrop, false)                       │
   │    %2  = (Base.sle_int)(1, 1)::Bool                                              │╻╷╷╷╷  #sumdrop
   └───       goto #3 if not %2                                                       ││┃│││   isempty
   2 ── %4  = (Base.sle_int)(1, 0)::Bool                                              │││┃│││   iterate
   └───       goto #4                                                                 ││││┃│     iterate
   3 ──       nothing                                                                 │
   4 ┄─ %7  = φ (#2 => %4, #3 => false)::Bool                                         │││││┃      iterate
   └───       goto #6 if not %7                                                       ││││││
   5 ──       invoke Base.getindex(()::Tuple{}, 1::Int64)                             ││││││
   └───       $(Expr(:unreachable))                                                   ││││││
   6 ──       goto #8                                                                 ││││││
   7 ──       $(Expr(:unreachable))                                                   ││││││
   8 ┄─       goto #9                                                                 │││││
   9 ──       goto #10                                                                │││╻      iterate
   10 ─       goto #11                                                                │││
   11 ─ %16 = invoke Main.:(#sumdrop#21)(2::Int64, true::Bool, %1::Function, _2::Array{Float64,2})::Union{Array{Float64,1}, Array{Float64,2}}
   └───       goto #12                                                                ││
   12 ─       return %16 

Constant propagation does not seem to solve the issue in this case.


#7

Indeed, the compiler is not clever enough to do that, too bad. Moving the arguments as positional and using @inline does make it infer though.
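For reference, a sketch of what that variant could look like. The names `sumdrop_pos` and `testsum_pos` are made up here; the claim that this version infers comes from the post above and may depend on the Julia version.

```julia
# Positional arguments plus @inline, so the constant `true` is visible
# to the compiler at the call site.
@inline function sumdrop_pos(a, dims, drop)
    s = sum(a, dims=dims)
    return drop ? dropdims(s, dims=dims) : s
end

testsum_pos(a) = sumdrop_pos(a, 2, true)

r = testsum_pos(ones(2, 2))   # row sums as a vector
```

Running `@code_warntype testsum_pos(randn(2,2))` shows whether the return type is concrete on a given setup.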


#8

Thanks for all these. This one from June seems pretty clean: https://github.com/JuliaLang/julia/issues/16606#issuecomment-398086850

This one could be a package:


#9

I agree, but I don’t have time for this ATM; anyone interested should feel free to go for it.


#10

Given that dims is now in the function name, I could get behind a positional argument: dropdims(A, 2) and even dropdims(sum, A, 2).

This is one of those cases where we added the dims keyword when it was still called squeeze… and then realized that dropdims was a better name in the first place. I’m not sure we would have ended up here had the name changed first.


#11

I really like this suggestion for the syntax a[:,sum].
If a[:,sum] drops the dimension, then it seems to fall naturally out of the APL-style indexing rules that a[:,[sum]] should keep the dimension (and a[:,[sum,median]] should yield an m×2 matrix with a column of sums and a column of medians, though this is probably very tricky to implement efficiently for arbitrary reductions).


#12

Yeah, it’s a cute idea but the implementation is a challenge. It’s also not clear to me that we’d want to add this kind of meaning to indexing — it’d be very different than everything else we currently support.


#13

Reading back through that thread I found https://github.com/JuliaLang/julia/issues/16606#issuecomment-398086850.

I had forgotten about that idea and I still like it — except maybe the keyword shouldn’t be squeeze, it should be dropdims.


#14

Thanks to whoever split this off, and apologies for derailing the previous thread.

The examples passing sum as a function seem strange to me, compared to sum(log, A) and reduce(+, A, dims=2) where the given function acts on elements, or pairs, not the array. Perhaps A[:,+] would be a more consistent version.

Right now sum(A, dropdims=1) seems very appealing, and clear – visually a small modification of sum(A, dims=1).


#15

I still think the higher order function approach, i.e. squeeze(sum, A, dims=2) is most appealing.