The verbosity of dropdims…dims…dims (was: range improvement for 1.1)

It would be nice if this could be something like sum(a, dims=1, drop=true) for all functions which take dims.

My understanding is that this would not be type stable.

I get that a generic squeeze function can’t be, but why would this be worse than dropdims itself?

Edit: Ah I see now, sorry. A keyword drop=(2,3) vs drop=nothing etc could let the type system know, but the value drop=true can’t. Unless all this new fancy constant propagation business can save us…

Perhaps you missed https://github.com/JuliaLang/julia/pull/28708

I know, and I said before it is indeed better, but still not as nice as linspace was. LinRange is a good substitution though.

It would be nice if this could be something like sum(a, dims=1, drop=true) for all functions which take dims.

That would not be type stable if understood as “drop all singleton dimensions”, but if it drops only the dimensions specified by dims, I think that would be fine with constant propagation?
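To make the distinction concrete, here is a rough sketch of the “drop all singleton dimensions” reading. The name squeeze_all is hypothetical and not part of Base; the point is only that the number of dimensions of the result depends on the runtime size of the input, not on its type.

```julia
# Hypothetical helper (not in Base): drop every dimension whose size is 1.
# The dims tuple is computed from runtime sizes, so its length, and hence
# the dimensionality of the result, cannot be known from the input type.
function squeeze_all(A::AbstractArray)
    singleton = Tuple(i for i in 1:ndims(A) if size(A, i) == 1)
    return dropdims(A; dims=singleton)
end

v = squeeze_all(ones(2, 1))   # Matrix{Float64} in, 1-dimensional array out
z = squeeze_all(ones(1, 1))   # Matrix{Float64} in, 0-dimensional array out
```

Both calls take a Matrix{Float64}, yet the outputs have different types, which is what makes this reading type unstable. Dropping exactly the dimensions named in dims does not have this problem in principle, because the number of dropped dimensions is fixed by the call.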

mbauman suggested sum(a, squeeze=1) (which would now perhaps be spelled sum(a, dropdims=1)) in https://github.com/JuliaLang/julia/issues/16606, which is a nice idea (although perhaps not very intuitive at first). StefanKarpinski suggested dropdims(sum, a, dims=1), which is nice and composable, but it loses something in translation: the main action is that I want to sum the array, not that I want to drop some dimensions, so sum should be the first thing to appear in the expression.
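For reference, the higher-order form can be sketched in user code without touching Base. The name dropping is hypothetical (adding a dropdims(f, A; dims) method to Base.dropdims from a package would be type piracy), and it assumes f accepts a dims keyword the way sum and maximum do.

```julia
# Hypothetical helper: apply a reduction `f` along `dims`, then drop those dims.
# Assumes `f(A; dims=...)` returns an array with size 1 along each reduced dim.
dropping(f, A::AbstractArray; dims) = dropdims(f(A; dims=dims); dims=dims)

A = ones(2, 3)
colsums = dropping(sum, A; dims=1)      # vector of length 3
rowmax  = dropping(maximum, A; dims=2)  # vector of length 2
```

This composes with any reduction that takes dims, but it does put the dropping first and the reduction second, which is the readability objection raised above.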

FWIW I would go with something like

sum(A, Drop(2, 3))
sum(A, Squeeze(2, 3))

where Drop and Squeeze wrap a tuple with a convenience splat syntax.
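A minimal sketch of what such a wrapper might look like, assuming Drop stores the dimensions in a tuple whose length is a type parameter. The struct and the sum method below are illustrative only and do not exist in Base.

```julia
# Hypothetical wrapper: the number of dropped dimensions N is part of the type.
struct Drop{N}
    dims::NTuple{N,Int}
end
Drop(dims::Int...) = Drop(dims)   # convenience splat: Drop(2, 3) == Drop((2, 3))

# Dispatching on our own type, so this extends Base.sum without type piracy.
Base.sum(A::AbstractArray, d::Drop) = dropdims(sum(A; dims=d.dims); dims=d.dims)

A = ones(2, 3, 4)
s = sum(A, Drop(2, 3))   # reduces over dims 2 and 3, leaving a length-2 vector
```

Because N is carried in the type of the Drop argument, the compiler at least has the information it needs to work out the dimensionality of the result without relying on constant propagation of a Bool flag.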

function sumdrop(a; dims, drop=false)
    s = sum(a,dims=dims)
    if drop
        return dropdims(s, dims=dims)
    end
    return s
end

function testsum(a)
    sumdrop(a,dims=2, drop=true)
end

@code_warntype testsum(randn(2,2))

Body::Union{Array{Float64,1}, Array{Float64,2}}
11 1 ── %1  = Main.sumdrop::Core.Compiler.Const(sumdrop, false)                       │
   │    %2  = (Base.sle_int)(1, 1)::Bool                                              │╻╷╷╷╷  #sumdrop
   └───       goto #3 if not %2                                                       ││┃│││   isempty
   2 ── %4  = (Base.sle_int)(1, 0)::Bool                                              │││┃│││   iterate
   └───       goto #4                                                                 ││││┃│     iterate
   3 ──       nothing                                                                 │
   4 ┄─ %7  = φ (#2 => %4, #3 => false)::Bool                                         │││││┃      iterate
   └───       goto #6 if not %7                                                       ││││││
   5 ──       invoke Base.getindex(()::Tuple{}, 1::Int64)                             ││││││
   └───       $(Expr(:unreachable))                                                   ││││││
   6 ──       goto #8                                                                 ││││││
   7 ──       $(Expr(:unreachable))                                                   ││││││
   8 ┄─       goto #9                                                                 │││││
   9 ──       goto #10                                                                │││╻      iterate
   10 ─       goto #11                                                                │││
   11 ─ %16 = invoke Main.:(#sumdrop#21)(2::Int64, true::Bool, %1::Function, _2::Array{Float64,2})::Union{Array{Float64,1}, Array{Float64,2}}
   └───       goto #12                                                                ││
   12 ─       return %16 

Constant propagation does not seem to solve the issue in this case.

Indeed, the compiler is not clever enough to do that, too bad. Moving the arguments as positional and using @inline does make it infer though.
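For anyone who wants to try it, the positional variant would look roughly like this. The names sumdrop_pos and testsum_pos are made up for this sketch; the body is the same as sumdrop above.

```julia
# Same logic as sumdrop, but with dims and drop as positional arguments
# and an explicit @inline hint. `sumdrop_pos` is a hypothetical name.
@inline function sumdrop_pos(a, dims, drop)
    s = sum(a, dims=dims)
    if drop
        return dropdims(s, dims=dims)
    end
    return s
end

# The literals 2 and true are visible at the call site.
testsum_pos(a) = sumdrop_pos(a, 2, true)

r = testsum_pos(ones(2, 3))
```

Whether the return type comes out concrete depends on the Julia version and its constant-propagation heuristics, so it is worth checking with @code_warntype testsum_pos(randn(2,2)) rather than assuming it.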

Thanks for all these. This seems pretty clean, from June I see https://github.com/JuliaLang/julia/issues/16606#issuecomment-398086850

This one could be a package:

I agree, but I don’t have time for this ATM. Anyone interested should feel free to go for it.

Given that dims is now in the function name, I could get behind a positional argument: dropdims(A, 2) and even dropdims(sum, A, 2).

This is one of those cases where we added the dims keyword when it was still called squeeze… and then realized that dropdims was a better name in the first place. I’m not sure we would have ended up here had the name changed first.

I really like this suggestion for the syntax a[:,sum].
If a[:,sum] drops the dimension, then it seems to fall naturally out of the APL-style indexing rules that a[:,[sum]] should keep the dimension (and a[:,[sum,median]] should yield an m×2 matrix with a column of sums and a column of medians, though this is probably very tricky to implement efficiently for arbitrary reductions).

Yeah, it’s a cute idea but the implementation is a challenge. It’s also not clear to me that we’d want to add this kind of meaning to indexing — it’d be very different than everything else we currently support.

Reading back through that thread I found https://github.com/JuliaLang/julia/issues/16606#issuecomment-398086850.

I had forgotten about that idea and I still like it — except maybe the keyword shouldn’t be squeeze, it should be dropdims.

Thanks to whoever split this off, and apologies for derailing the previous thread.

The examples passing sum as a function seem strange to me, compared to sum(log, A) and reduce(+, A, dims=2) where the given function acts on elements, or pairs, not the array. Perhaps A[:,+] would be a more consistent version.

Right now sum(A, dropdims=1) seems very appealing, and clear – visually a small modification of sum(A, dims=1).

I still think the higher order function approach, i.e. squeeze(sum, A, dims=2) is most appealing.

Sorry for reviving this, but I came across it in the past few days.

Has there been any progress or style recommendation on this issue?