Sum + squeeze as one function

Is there a built-in version of sum(A, dims) that automatically squeezes the summed out dimension? Something like:

sumsqueeze(A, dim) = squeeze(sum(A, dim), dim)

Sure, I can just combine the two functions, but in my use case this code is generated, and having a single built-in function would make it much easier to optimize later.

What’s wrong with the definition you’re using right now?

As I said, I’d prefer a single built-in function to make further optimizations easier. Just to give you a sense of what I’m doing:

  1. In one of my packages I do a lot of code generation. Generated code may be executed elsewhere, so it’s better to rely only on built-in features rather than introduce my own sumsqueeze.
  2. Each function call is parsed into a single expression, so B = squeeze(sum(A, dim), dim) is actually represented as:
tmp1 = sum(A, dim)
B = squeeze(tmp1, dim)

which is harder to match against existing optimization patterns. Still doable, but given the amount of discussion around sum/squeeze on GitHub, I wanted to make sure there’s no simpler solution before going this way.
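
To make the matching problem concrete, here is a minimal sketch of recognising the nested form directly on the expression tree, before it is split into temporaries. The helper name match_sumsqueeze is hypothetical and is not taken from any package; it only handles the exact two-argument call shape shown above.

```julia
# Hypothetical helper: recognise squeeze(sum(A, d), d) in an expression tree.
# Returns a named tuple with the array and dimension expressions, or nothing.
function match_sumsqueeze(ex)
    # Outer call must be squeeze(<inner>, <d>)
    ex isa Expr && ex.head == :call && length(ex.args) == 3 || return nothing
    ex.args[1] == :squeeze || return nothing
    inner, d = ex.args[2], ex.args[3]
    # Inner call must be sum(<A>, <d>) with the same dimension expression
    inner isa Expr && inner.head == :call && length(inner.args) == 3 || return nothing
    inner.args[1] == :sum && inner.args[3] == d || return nothing
    return (A = inner.args[2], dim = d)
end

m = match_sumsqueeze(:(squeeze(sum(A, dim), dim)))
# m.A is the symbol :A and m.dim is the symbol :dim
```

Once the call has been flattened into tmp1 = ... / B = ..., the same check has to follow the temporary across two statements, which is the extra work described above.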

There’s no difference between a “built-in” function and defining sumsqueeze(A, dim) = squeeze(sum(A, dim), dim) yourself, except for the module in which the function is defined.

The difference is that sum and squeeze are always there, while sumsqueeze would require an additional dependency. Here’s a use case:

  1. I create a module A that generates expression ex.
  2. This expression is sent to another worker / saved to a file / manually copy-pasted to another module B.

If ex includes sumsqueeze, B should import it and thus it depends on A. If only built-in functions are used, no dependency is introduced.

It is probably a better idea to not have sumsqueeze defined in A, but rather a third module C which B depends on.

It’s somewhat unusual in Julia to be doing code generation where the generated code is used via copy and paste. Julia provides a very powerful macro facility for metaprogramming, in which generated expressions can be directly inserted into code at parse time.

One of the many advantages of using macros for code generation is that if a module A defines a macro @foo, then the code generated by @foo can call functions defined only in module A, even if @foo is called by someone else’s code or module.
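
A minimal sketch of that point, assuming Julia 1.x (where squeeze was renamed dropdims and dims became a keyword argument). The module names A and B and the macro @sumsqueeze are hypothetical, chosen to mirror the modules discussed above. Names that are not escaped in the returned expression are resolved in the module that defines the macro, so B never needs to see sumsqueeze itself.

```julia
module A
export @sumsqueeze   # only the macro is exported, not the function

# Helper that lives only in A (Julia 1.x spelling of squeeze(sum(X, dim), dim))
sumsqueeze(X, dim) = dropdims(sum(X, dims=dim), dims=dim)

macro sumsqueeze(X, dim)
    # `sumsqueeze` is left unescaped, so it refers to A.sumsqueeze;
    # the user's arguments are escaped so they are evaluated in the caller's scope.
    :(sumsqueeze($(esc(X)), $(esc(dim))))
end
end

module B
using ..A            # brings in @sumsqueeze only
f(X) = @sumsqueeze X 1
end

result = B.f(ones(3, 2))
```

This still requires B to load A in order to use the macro, so it addresses the name lookup, not the case where the expression is pasted somewhere A is unavailable.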

Relevant issue so you can understand why reducers don’t already squeeze:

https://github.com/JuliaLang/julia/issues/16606

In short, you often want to keep those extra dimensions, but not always. Since it’s easier to get rid of them than to try to put them back after they’re gone, we do it that way. It’s a tradeoff. One potential pattern for addressing this would be to define this squeeze method:

squeeze(f, A, dims) = squeeze(f(A, dims), dims)

But getting the type signature right might be a little tricky.
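
For anyone reading this on Julia 1.x, a hedged sketch of the same pattern: squeeze was renamed dropdims and the dimension is now passed as the dims keyword. The name squeezing below is hypothetical and is defined outside Base, to avoid adding methods to Base functions from user code.

```julia
# Hypothetical name; not part of Base.
# Apply a reducer f over `dims`, then drop the reduced (singleton) dimensions.
squeezing(f, A, dims) = dropdims(f(A; dims=dims); dims=dims)

A = reshape(collect(1.0:24.0), 2, 3, 4)

s1 = squeezing(sum, A, 2)            # reduce over one dimension
s2 = squeezing(maximum, A, (1, 3))   # dims may also be a tuple
v  = squeezing(sum, ones(3, 2), 1)
```

Any reducer that accepts a dims keyword (sum, maximum, prod, and so on) should work with this sketch; functions without that keyword will not.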

And I guess he’s just asking whether that C could be (or already is) Base :)

It may be the case that squeeze(f, A, dims) is sufficiently simple that there is no need to have it in Base, but I think it is the case that a lot of people hit this “problem” now and again.

What is the type signature problem you’re referring to, @StefanKarpinski?

I was thinking that squeeze was variadic and since leading function arguments are a little tricky to type sufficiently loosely (because non-functions can be callable), that would be a problem. But squeeze isn’t variadic – it takes a scalar or a tuple for dims – so that’s not an issue.

Hi, is there any update on this?

I packaged a trivial workaround that I find useful:

https://github.com/tpapp/Squeezing.jl

Basically:

julia> @squeezing sum(ones(3, 2), 1)
2-element Array{Float64,1}:
 3.0
 3.0