"Optional" in-place output arguments

I am considering writing a function

foo!(i1,i2,o1,o2,o3)

where i1 etc. are inputs and o1 etc. are outputs. Importantly: all the outputs are preallocated (mutable) vectors, and the results are to be written in place into these vectors.

I would like to make it possible to call foo! to request only some of the outputs, but without cluttering the code of foo! with either multiple methods or if..thens, and with the full performance gain from not doing unnecessary computations. Did someone say “we are greedy”?

What I could do is create a new datatype NotWanted with no content. For this type I implement setindex! as an inline no-op function. I would then pass an object of this type as an argument when the results are… not wanted.
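Something like this is what I have in mind (a minimal sketch, assuming the only thing foo! ever does to an output is indexed assignment):

struct NotWanted end

# No-op: anything "written" into a NotWanted object is silently discarded.
@inline Base.setindex!(::NotWanted, value, indices...) = nothing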

What I wonder is: can the compiler (and what must I do to help it) cut out of the method instance all the operations in foo! made unprofitable by the data being thrown away?

function foo!(i1,i2,o1,o2,o3)
    o1[:] = 2 .* i2
    tmp = 3*i1          # data going nowhere, please do not compile
    o2[:] .= tmp .+ o1  # "
    o3[:] = ...
    return
end

o1 = Vector{Float64}(undef,2)
o2 = NotWanted()
o3 = ...
foo!(rand(2),rand(2),o1,o2,o3)
@show o1,o3

Couple things:

  1. You mention being aware of the option of defining multiple methods of foo!. That would be my first choice to solve problems like this, and is the easiest and most direct way to get the compiler to do what you are wanting. IMO, your idea involving creating new types (e.g. NotWanted) would be more “cluttered” (and less idiomatic) than having multiple methods.
  2. Depending on the complexity/cost of the calculations to produce o1/o2/o3, the cost of an if statement might be negligible compared to actually running (or not) the calculations.
  3. (Minor nit) Convention for mutating functions is to have the mutated arguments first, e.g. foo!(o1,o2,o3,i1,i2)

I would choose between multiple methods and if statements based on how you will be using the function. If you know at the call site which outputs you will need, then multiple methods make the most sense; if the set of needed outputs is data dependent, then if statements make more sense.
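For what it’s worth, here is a minimal sketch of the multiple-methods idea, using a hypothetical two-output version of foo! (mutated arguments first, per point 3). Each method computes only what its arguments ask for, and dispatch picks the right one:

foo!(o1::AbstractVector, o2::AbstractVector, i1, i2) = (o1 .= 2 .* i2; o2 .= 3 .* i1 .+ o1; nothing)
foo!(o1::AbstractVector, ::Nothing,          i1, i2) = (o1 .= 2 .* i2; nothing)           # o2 not wanted
foo!(::Nothing,          o2::AbstractVector, i1, i2) = (o2 .= 3 .* i1 .+ 2 .* i2; nothing) # o1 not wanted
foo!(::Nothing,          ::Nothing,          i1, i2) = nothing                             # nothing wanted

Calling foo!(o1, nothing, i1, i2) then compiles and runs only the o1 work.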

3 Likes

Your point 3. is in order: I have systematically been doing the opposite! :smiley:

Your remarks 1. and 2. are valid, but I did not make my motivation clear:
There will be many foo! functions, to be written by innocent users of my little package. I am doing twists and flips to simplify their work of writing the foo!s.

The performance loss of if...then is quite acceptable, but the time and code clutter I impose on the foo! programmer by requiring multiple methods, or by requiring an analysis of which code can be if-ed out, is not (that is my premise in asking the question).

How’s this?

function foo!(
	o1::Union{Vector{Int},Nothing},
	o2::Union{Vector{Int},Nothing},
	o3::Union{Vector{Int},Nothing},
	i1,
	i2
)
	isnothing(o1) || fill!(o1, i1)
	isnothing(o2) || fill!(o2, i2)
	isnothing(o3) || fill!(o3, i1 + i2)
end

o1 = zeros(Int, 20)
o3 = zeros(Int, 20)

foo!(o1, nothing, o3, 4, 2)

@show o1
@show o3
# o1 = [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
# o3 = [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
1 Like

In the solution you propose, you are explicitly testing isnothing. In my tiny example, this is really not an issue, but in a “real” function, with complicated computations, having to explicitly test “let me see, what outputs are wanted, do I need to compute this intermediate result” complicates the code.

I want to find out if there is any way to have the compiler make these decisions when a method-instance is compiled (when the types of o1... are known).

Tough question, I know. I am quite prepared to hear that, no, lazy guy, you will have to do that yourself.

:smiley:

I’m not sure if I understand your question correctly, but it sounds like you are looking for something along the lines of

foo!(o1::Vector, o2::Vector, o3::Vector, i1, i2) = ...
foo!(o1::Vector, o2::Vector, o3::Nothing, i1, i2) = ...
foo!(o1::Vector, o2::Nothing, o3::Vector, i1, i2) = ...
foo!(o1::Vector, o2::Nothing, o3::Nothing, i1, i2) = ...
...

and so on for all 2^3 = 8 possible combinations. For n outputs, the number of combinations to consider is 2^n, so this quickly becomes impractical to code by hand. You might be able to use metaprogramming to generate all the methods; I haven’t delved into this topic yet myself.
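Something along these lines might work (an untested sketch with a hypothetical two-output function bar!, just to show the shape): build each method body from only the wanted outputs, then define it with @eval.

for (t1, t2) in Iterators.product((:Vector, :Nothing), (:Vector, :Nothing))
    body = Expr(:block)
    t1 === :Vector && push!(body.args, :(o1 .= 2 .* i1))
    t2 === :Vector && push!(body.args, :(o2 .= 3 .* (i1 .+ i2)))
    @eval bar!(o1::$t1, o2::$t2, i1, i2) = $body
end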

Alternatively, I think the following code will compile each method on the fly, but it still requires the || thing in the function body.

function foo!(o1::T1, o2::T2, o3::T3, i1, i2) where {T1 <: Union{Vector,Nothing}, T2 <: Union{Vector,Nothing}, T3 <: Union{Vector,Nothing}}
    ...
end

Again, I am not sure if I have understood your question correctly, and I am not sure what kind of performance cost/benefit this approach would have.

1 Like

One bit of background to the question is the following: say I insist (lazy!) and write a single method. When the method is called with a specific set of input variables (Array or NotWanted or nothing), an instance is compiled by Julia. An instance is machine code defined by the method plus the types of all inputs.

And then I want Julia’s compiler/LLVM to be smart and, for each instance it compiles (part of the 2^n set you mention), eliminate as much as possible from the machine code. This requires the compiler to do the analysis work (“this intermediate result has no effect on any output”) that I want not to do myself.

Metaprogramming takes some getting used to, but you are right, this could be used to generate the “2^n” methods. In my case n=4, so it’s not crazy. But this would require me to program the logic (of identifying and cutting out dead code) that I hope/wish were built into the Julia compiler. And here I throw in the sponge: this is too hard for me!

1 Like

Yep, it sounds like metaprogramming is what you need, and I suspect that if you do it right you can get it to generate methods on the fly rather than dumping all 2^n methods at once.

I hope that someone more experienced than me can chime in and provide a sketch of how this might work. I haven’t learned metaprogramming before because I never encountered a case where it felt necessary, but this may be just the thing.

1 Like

This really is something that is solved by having your users write more methods, with the selection of which method is chosen left to dispatch, steered by the number and types of arguments you’re calling the user-made function with.

I’m not sure I follow why having your users write more functions would lead to your package having to deal with more code paths than it already has to deal with anyway. Could you elaborate on this, and on how you expect to use the user-provided functions in your package?

It feels a little bit like you want to use R-like detection of the number of “requested output arguments” or Fortran-like “which arguments are provided and which outputs are enabled via flags” to steer how your function behaves, which just isn’t how julia works/is intended to be used here. There’s almost certainly a more julian solution for this.

2 Likes

Again, the context of my question is: I want to minimize the work I impose on the “user” that is writing foo!. And you are right, I need to explain. I am writing special FEM software. Each type of finite element will have its foo!, and I want colleagues, students, and the world at large to contribute new element types, so I want to make contributing easy. That’s why I do not want to bother my users with multiple methods (and I have worked with Julia long enough to be very enthusiastic about multiple dispatch) or ifs.

But then I want the FEM software to be able to request different things from an element. Sometimes the solver will want the residuals for all fields, sometimes only for some fields, and I don’t want to CPU-pay for what I am not getting.

1 Like

I think what you want is doable by creating a type NotWanted with the right type bounds and the right methods.

For example, if you can guarantee that all not-wanted outputs are supposed to be arrays, you may create a NotWanted subtype of AbstractArray with a setindex! method that does literally nothing, and hope the compiler will suppress the dead branches.

However, I doubt the compiler is smart enough to remove all the dead code; you might have to help it by telling it which things might be dead.

In this case, a macro that reads the function written by your user and fills it with if x isa NotWanted… statements where they are needed is probably the easier way to get what you want. Then your users only have to define their functions with @smart function fun(x)... end, which is a common idiom in Julia. The issue is that you will have to learn metaprogramming and manipulate the abstract syntax tree to remove all calls that are ‘dead’, which is going to be a long journey… You will basically be making a DSL.
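To give a rough idea of the shape of such a macro (a toy sketch only, with a hypothetical name @skipnotwanted; it merely guards top-level assignments of the form o[...] = rhs and does no analysis of shared intermediate results):

struct NotWanted end

macro skipnotwanted(fdef)
    body = fdef.args[2]                        # the function body block
    body.args = map(body.args) do ex
        if ex isa Expr && ex.head == :(=) && ex.args[1] isa Expr && ex.args[1].head == :ref
            target = ex.args[1].args[1]        # the array being written into
            :($target isa NotWanted || ($ex))  # skip the whole assignment if not wanted
        else
            ex                                 # leave everything else untouched
        end
    end
    esc(fdef)
end

@skipnotwanted function bar!(o1, o2, i1, i2)
    o1[:] = 2 .* i1
    o2[:] = 3 .* (i1 .+ i2)
end

bar!(zeros(2), NotWanted(), rand(2), rand(2))   # the o2 line is skipped entirely

With concrete argument types, the o isa NotWanted branch is decided at compile time, so the guarded right-hand sides should disappear from the specialized code; the real difficulty is pruning the intermediate computations that feed only unwanted outputs.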

1 Like

Hi,

Precisely, I was looking at setindex!. I wondered how far NotWanted would take me, and as you say: only part of the way.

It would take a smart guy to write @smart - if the guys who set up LLVM did not do it, I’ll find myself some lesser challenges to tackle. :yum:

To me at least it sounds like forcing your users to write ifs, or teaching them to query their arguments for what they should output, is more confusing than having them write different methods/functions for different requests from the solver. Why not make it explicit for both your users and someone writing a solver what they request, by having that method be named differently? Forcing everything into a single function will lead to harder-to-maintain and harder-to-develop code down the road. We have the freedom to give names to functions to make it clear what they’re doing after all, so making use of that here seems appropriate.

I don’t think this is necessary and will lead to this being harder to maintain/extend in the future, on top of having to go for metaprogramming when that’s absolutely not required here. That macro would more or less just define a new function for each branch - why not give that branch an actually appropriate name by implementing the function directly?

A macro can’t do a magic branch elimination based on how a function would be called from a solver. That’s just not what it does. All a macro can do is transform some valid, parseable syntax into some other syntax. Nowhere does any code elimination come into play, unless you write the macro to do that based on syntax alone (which it can’t, when it depends on how the result of the macro would be called).

1 Like

Guys, thank you for the input. I plan to make a few tests with a simple NotWanted, look at @code_lowered, and report here (give me plenty of time). There is one important option that remains relevant: let foo! compute everything anyway, and chill about CPU time!

Sukera - I generally agree with you, but two points:

  1. The various outputs are computed based on some common intermediate results, so separating foo! into multiple functions would result in code and computation duplication. (I tried to simplify my question. In doing so, I withheld relevant information…)
  2. The metaprogramming idea (which I will not pursue!) would be to generate methods with more specialised input foo!(...,o1::NotWanted,...) and code pruned accordingly. I think a macro can do that. In principle!!! :grinning:

Again, thank you all!

I think what you need is to allow the users to pass foo as an argument, and possibly parameters, using closures, such that your inner code doesn’t need to deal with those possible (infinite) variations.

And perhaps define different types of output to be returned by the user provided function, to be handled differently by your package.
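A rough sketch of what I mean (names purely illustrative): the user-written closure captures the preallocated outputs it cares about, and your package just calls it with the inputs.

make_element(o1, o3) = (i1, i2) -> begin
    o1 .= 2 .* i1            # only the captured outputs are ever computed
    o3 .= 3 .* (i1 .+ i2)
    nothing
end

o1 = zeros(2); o3 = zeros(2)
element! = make_element(o1, o3)   # a user not wanting some output simply never captures it
element!(rand(2), rand(2))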

3 Likes

I will second this. I now understand the reasons you wish to avoid if-statement or multi-method approaches, but I think a better long-term solution would be to spend more time on the design of your API to try and avoid the need for if statements, multi-methods, or a metaprogramming/DCE approach. From your description, it sounds like you are building something that is, or has similarities to, user-specified callbacks. I recommend taking a look through the SciML ecosystem docs and code, as I understand they heavily incorporate the use of callbacks.

2 Likes

I will look into closures indeed!

So if an output argument is not part of the remaining arguments of the closure, it is not wanted (that is how I read your suggestion).

Note that my purpose is not to simplify the work of the caller (I program the calling code and I’ll be OK), but the code of foo!.

The usual question remains - how smart will the compiler be at eliding foo!’s code within the various method-instances. One way to find out: I’ll test.

:slightly_smiling_face:

I will have to think very hard on how callbacks can be key to what I want to do.

Thanks!

So I did some testing:

struct NotWanted end
Base.setindex!(A::NotWanted, X, inds...) = nothing   # no-op; extend Base.setindex! so o[:] = ... hits it

function foo!(o1,o2,o3,i1,i2)
    o1[:] = 2*i1
    o2[:] = 3(i1 .+ i2)
    tmp   = i1 .+ i2
    o3[:] = 3tmp
end

o1 = Vector{Float64}(undef,2)
o2 = NotWanted()
o3 = NotWanted()
i1 = randn(2)
i2 = randn(2)
@code_lowered foo!(o1,o2,o3,i1,i2)

yields

CodeInfo(
1 ─ %1 = 2 * i1
│        Base.setindex!(o1, %1, Main.:(:))
│   %3 = Base.broadcasted(Main.:+, i1, i2)
│   %4 = Base.materialize(%3)
│   %5 = 3 * %4
│        Base.setindex!(o2, %5, Main.:(:))
│   %7 = Base.broadcasted(Main.:+, i1, i2)
│        tmp = Base.materialize(%7)
│   %9 = 3 * tmp
│        Base.setindex!(o3, %9, Main.:(:))
└──      return %9
)

which implies: making setindex! a no-op does just that, and only that: upstream computations are not elided, whether they are coded on the same line (o2) or on previous lines.

I also created a closure

close!(o1,i1,i2) = foo!(o1,o2,o3,i1,i2)
@code_lowered close!(o1,i1,i2)

yielding the straightforward

CodeInfo(
1 ─ %1 = Main.foo!(o1, Main.o2, Main.o3, i1, i2)
└──      return %1
)

So if Main.o2==NotWanted() I am back to the previous test.

In short, I have been indulging in some wishful thinking. A redesign (involving closures?) is needed, or some form of concession (programming foo! will require some more work).

Thanks everyone!

Beyond the question of whether or not your request is technically possible, I suspect that this will make your code more confusing to your users, and not easier.

For one, you are coming up with some unusual patterns that seem convoluted and perplexing; and, secondly, you are ‘tricking’ the reader, who expects foo!(x, ...) to actually mutate x.

Making things easier by making them more complex and unintuitive might not be the optimal strategy. How are you planning to explain your interface to inexperienced users, when seasoned Julians struggle to follow your reasoning?

3 Likes