Multiple dispatch in mathematical writing

Hey,

I have been coding almost exclusively in Julia for a while, and I noticed a strange habit in my mathematical writing that was not there before: I have started to use overloading and multiple dispatch everywhere. My reviewers disagreed, and so the question has been nagging at me.

Suppose I define a function f : \mathbb R \to \mathbb R, say f(s) = s^2. Then, for any measure \nu \in \mathcal M_+(\mathbb R), I define f(\nu) = \int f(s) \nu(ds).

Furthermore, for a dataset \mathbf x \in \mathbb R^{N}, I use f(\mathbf x) to denote the estimate of the parameter f(\nu) computed from this dataset.

Then, quite naturally, I define a quadratic loss \lVert f(\mathbf x) - f(\nu) \rVert_2^2, and the optimization goal is to find a measure \nu that would minimize this loss…
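For readers less familiar with Julia, here is a minimal sketch of how the three meanings of f above could be written as methods of one function. The `DiscreteMeasure` type, the choice of the empirical mean as the estimator, and the name `loss` are assumptions made for illustration only; they are not part of the original setup.

```julia
# Hypothetical type for illustration: a finite discrete measure
# sum_i weights[i] * delta_{atoms[i]}.
struct DiscreteMeasure
    atoms::Vector{Float64}
    weights::Vector{Float64}
end

# f on real numbers: f(s) = s^2
f(s::Real) = s^2

# f on measures: f(nu) = integral of f(s) nu(ds), here a finite weighted sum
f(nu::DiscreteMeasure) = sum(w * f(s) for (s, w) in zip(nu.atoms, nu.weights))

# f on a dataset: an estimate of f(nu). The empirical mean of f over the
# sample is an assumed estimator, chosen only to make the sketch concrete.
f(x::AbstractVector{<:Real}) = sum(f, x) / length(x)

# Quadratic loss between the estimate and the parameter
loss(x::AbstractVector{<:Real}, nu::DiscreteMeasure) = abs2(f(x) - f(nu))

nu = DiscreteMeasure([1.0, 2.0], [0.5, 0.5])
x = [1.0, 2.0, 3.0]

f(3)         # calls the Real method
f(nu)        # calls the DiscreteMeasure method
f(x)         # calls the AbstractVector method
loss(x, nu)  # combines the last two
```

Which method runs is decided by the type of the argument alone, which is the sense in which the notation is unambiguous to the compiler.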

What do you think about the notation f? Is it ambiguous? According to Julia's dispatch, it is not. Do you do such things when writing math?

Edit: my example is only overloading, but the same thing happens with dispatch: f(x,y) means a completely different function from the ones above…

5 Likes

It's quite common to overload math notation like this, but I think that mathematicians may not realize how overloaded their notation actually is. In your example, it might be more explicit to have a functor that lifts f from \mathbb R to \mathcal M_+(\mathbb R), but it's also quite common to just use the same name. I think it depends on how important it is to think about the functor application as an explicit operation.

9 Likes

There's a pretty common function called "+" that is overloaded like crazy in mathematical writing…

17 Likes

Indeed, very good point. I need to build up my case.

1 Like

Right, + is always a great example. Sometimes people will distinguish between + in different objects or categories (often with a subscript or superscript), but usually they just write + everywhere and let which + they want to use be implied by the context of what kinds of objects are being added. Multiple dispatch isn't the only way to implement that kind of overloading in a programming language, but it certainly does seem to be a good fit, so I think your approach is well justified.

3 Likes

This convention is common in Physics. Consider a field variable \phi which could be written as a function of Cartesian coordinates, i.e. \phi(x,y,z), or spherical coordinates, i.e. \phi(r,\varphi,\theta).

2 Likes

In math, overloading is often called abuse of notation. As in programming, this can drastically simplify or obscure things.

6 Likes

I'm a professional math guy and have considerable experience in editorial work. Overloading like this can confuse readers and referees (bad) and get your paper needlessly rejected. I think your particular example is reasonable, but I can understand a referee's becoming annoyed at your quadratic loss definition and the different meanings for the symbol f. Telling the referee that it's multiple dispatch will not make things go better, nor will saying that it's just like +.

I like it when my papers and proposals are accepted and avoid doing things like this. @jw3126 has it right.

6 Likes

Thanks for being so direct; that is exactly the core of my question.

I think you have to take it on a case-by-case basis. If the application of f to different kinds of arguments is clearly just a generalization of the same conceptual thing, then used judiciously it's reasonable and even commonplace.

For example, no one has a problem with the same name exp used for the exponential of both scalars and linear operators, from real numbers to matrices to differential operators on functions.
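The scalar and matrix cases can be seen side by side in Julia, where `exp` has methods for both once the `LinearAlgebra` standard library is loaded. This is only a small sketch; the matrix `A` is an arbitrary choice (nilpotent, so that its exponential is easy to check by hand), and the differential-operator case is not covered.

```julia
using LinearAlgebra

# Scalar exponential
exp(0.0)                  # 1.0

# Matrix exponential: for this nilpotent A (A^2 = 0) the power series
# terminates, so exp(A) should equal I + A up to rounding.
A = [0.0 1.0; 0.0 0.0]
expA = exp(A)

# Broadcasting is a different operation: it applies the scalar exp
# to each entry and is not the matrix exponential.
elementwise = exp.(A)
```

Here the same name is justified because the matrix method reduces to the scalar one for 1×1 input, whereas `exp.(A)` has to be spelled differently because it means something else.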

On the other hand, in the original example, f(s) = s^2 is nonlinear when s is real, but f(ν) is a linear operator when ν is a measure. I would think of these as two totally different functions (even if the latter involves f(s)), and it seems confusing to give them the same name.

8 Likes

I think this is pragmatic advice, but why is + different from f? I genuinely want to know if there's a reason for these to be treated differently or if it's just traditional that + is allowed to be so heavily overloaded.

1 Like

History and common usage. + is overloaded in non-mathematical English, unary operations are clear to most people, and @stevengj's example of exp is standard and all math people would get it. f means what you say it does in your paper, and defining it to be two different things is poor form. There's a difference between things like exp, +, \Sigma, which are well understood, and notation you invent yourself.

I think the bottom line is that it's fine to use commonly overloaded things and not fine to make up overloaded symbols yourself. Most referees will get the message when you use exp(A) no matter what A is and still have every right to be unhappy with several definitions of f.

4 Likes

This would seem to rule out matrix functions, which are a standard idea in which any analytic function f: \mathbb{C} \to \mathbb{C} can be generalized to act on square matrices (or many other linear operators), and people just write f(z) and f(A) for an arbitrary "user-defined" f.

1 Like

I think a crucial difference is that it's completely fine to use the same notation for generalizing one notion, in a way that is backward compatible. For instance, matrix functions are defined in a (Banach) algebra, of which numbers and linear operators are instances; similarly, + is usually an instance of the group operation of an abelian group. It's different from, e.g., using psi(x) and psi(k) for a function and its Fourier transform (most physics people will think it's an eminently reasonable shortcut for simplifying complicated expressions, most math people will never talk to you again). The OP example of f(ν) IMO falls into the second category: it's better to use a different notation than () for both (and I would argue in Julia code as well). An interesting case is notation that could plausibly be interpreted with two meanings, e.g. M >= 0 for a matrix (is it in the sense of SPD matrices or componentwise?).
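The two readings of M >= 0 can disagree on the same matrix, as a small Julia sketch shows. The matrix `M` below is an arbitrary example, and `isposdef` from the `LinearAlgebra` standard library tests strict positive definiteness, so it only stands in for the "SPD" reading.

```julia
using LinearAlgebra

# Symmetric matrix with eigenvalues 1 and 3, but with negative entries
M = [2.0 -1.0; -1.0 2.0]

# Componentwise reading: every entry is nonnegative
componentwise = all(M .>= 0)

# Positive-definite reading: symmetric with all eigenvalues positive
definite = isposdef(M)
```

For this `M` the componentwise check fails while the definiteness check succeeds, so a paper writing M >= 0 needs to say which one it means.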

8 Likes

That's pretty standard too and very traditional. Nobody would have problems with that unless you decided to change the definition of a standard function, e.g.…

Let exp(x) = x^2

1 Like

Ooff. This is one of my least favorite examples. It's one thing when the various behaviors of a function can be determined by looking at the input values (like + for reals vs vectors). But here it's just the variable names that distinguish them; once you provide input values, you have no idea what's going on.

This is probably my very least favorite one. Why not just put a hat on the psi? What does psi(0) mean here?!

3 Likes

Then you disambiguate by writing psi(k=0). As a card-carrying mathematician I was as appalled as you at first, but I have to say it creates far fewer issues than you'd think, and now I find myself using this kind of shortcut when writing on the board or in notes (not in papers). In math there's a lot of emphasis on the fact that mathematical writing should be compilable, but that sometimes leads to register spill issues where you're forced to use symbols that don't really match their use, or even out-of-memory errors (the "oh no I'm out of Latin and Greek, I'll now use Hebrew" syndrome). In physics, since you're using the same name for everything, you can just use a limited number of registers that have a clear purpose, and never run out of them.

Another quirk of the "names don't matter as long as it's correct" motto is that you sometimes find definitions and formulas that are physically strange. For instance, the definition of Fourier transforms of distributions usually goes by extending the first formula in Fourier transform - Wikipedia, which is just weird (is x real space or reciprocal space?). (This particular one is just a bad choice of definition imo, much better to introduce conjugates and define it through good old Parseval.)

2 Likes

It's been around for a long time… certainly confusing though, I agree.

Well, it's not very nice when you start using scaled and shifted coordinates inside the functions.

But most of all, I think it reduces clarity, and makes it really hard to keep track of what is going on, so I think this is even worse for educational purposes than for publishing.

2 Likes

The issue is that having f, \tilde{f}, \hat{f}, \tilde{\hat{f}} and \hat{\tilde{f}} all mean different things makes things hard to read too.

There were a lot of interesting arguments here on both sides. I am trying to find a cleverer notation than the one I had, one that disambiguates these things.

I still think that multiple dispatch should apply in math papers, but this is a fight I am not willing to take on myself…

1 Like