uhmmm, I feel a little worried that I seem to have appointed myself as an inspector of what should or should not be published on the internet (or of whether something does more damage than it helps). It is quite a heavy burden, and one that I do not want to carry alone, but I can offer some criticism; just remember I am not your editor.
The quotes come from the article as it was on Mon, 21 Feb 2022 13:18:42 -0300 (an already-updated version).
Multiple dispatch is a simple concept: Functions can have multiple definitions as long as each definition accepts a different combination of arguments.
The statement is a bit imprecise. Functions can have multiple definitions as long as each definition restricts the types of the parameters differently. The text does not make clear that it is the types of the arguments that determine which “definition” will be called.
In Julia, the “definitions” are called “methods”, but I understand avoiding that term to prevent confusion with its meaning in object-oriented languages.
The “different” is also a little vague, but I am not sure how much information you want to push onto your readers from the start. I say that because, between
f(a, b) = true, and
f(a :: Int64, b :: Int64) = false,
the first method also covers the case where both arguments are Int64 and would be called if the second method did not exist; but since the second method does exist and is more specific, it will be called instead.
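The situation above can be checked directly in a REPL (same definitions as in my snippet):

```julia
# Two methods of the same function; the second restricts both
# parameters to Int64 and is therefore more specific.
f(a, b) = true
f(a::Int64, b::Int64) = false

f("x", "y")  # true: only the generic method matches
f(1, 2)      # false: both methods match, the more specific one wins
```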
From a very high-level perspective, multiple dispatch may be a simple concept; however, Jeff Bezanson himself (the idealizer of multiple dispatch in Julia) admits he does not have much of an idea of how to tackle the difficult problem of defining what really makes one set of parameters more specific than another (the core question that must be answered for multiple dispatch to work in general). You can see him discussing this in his 2019 talk: JuliaCon 2019 | What's Bad About Julia | Jeff Bezanson - YouTube (I have linked the specific time, but if that is lost, the relevant discussion starts at 16:25).
To keep things simple going forward, my examples will technically be single dispatch to help readers make sense of the ways Julia leverages dispatch.
I understand that, to reuse as much text as possible, this is the easier solution, but I would use the single-dispatch examples to illustrate single dispatch (say something like “Let us start with single dispatch, which many modern languages have …”) and then present multiple-dispatch examples afterwards.
Which implementation to call is determined at runtime with a lookup table.
Yes and no. There is a little imprecision here that may scare away some performance-obsessed people, and it also changes a little how you should interpret the results of your experiments below. Julia’s behaviour (in terms of what the code does) will always be indistinguishable from a runtime check against a lookup table. However, what in fact happens, maybe most of the time, is the following: a function f is called with some arguments, so the correct method body is looked up and compiled for those specific argument types; any calls to functions h that happen inside f have the correct method body looked up a single time and are then inlined/fixed to the correct method, so the next time f is called with the same argument types there is no runtime query to the lookup table. But this is a compiler optimization, and Julia guarantees that, except for the speed improvement, it behaves the same as said lookup. Also, this guarantee of behaviour comes from the fact that Julia has dynamic dispatch, not multiple dispatch; one could have one and not the other (many single-dispatch languages have dynamic dispatch).
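A sketch of what I mean (the function names here are mine, not from the article): when the container’s element type is concrete, the compiler resolves the method once; with a Vector{Any}, each element’s type is only discovered at runtime, so each call really goes through the dynamic lookup:

```julia
g(x::Int64)   = x + 1
g(x::Float64) = x + 0.5

# Concrete element type: the method of `g` to call is known at compile
# time, so the lookup happens once and the call can be inlined.
total_int(v::Vector{Int64}) = sum(g, v)

# Abstract element type: each element's concrete type is only known at
# runtime, so each call to `g` goes through the dynamic lookup.
total_any(v::Vector{Any}) = sum(g, v)

total_int([1, 2, 3])       # 9
total_any(Any[1, 2.0, 3])  # 8.5
```

Both versions compute the same thing; only the second pays the dynamic-dispatch cost per element.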
To implement multi-type functionality without multiple dispatch, you would either need three different function names and an if-statement tree to decide which one to call, or you would need a single function with if-statements inside that determine which functionality to execute.
But the functionality that you just demonstrated was single dispatch, which Python has and which therefore does not need these alternatives, am I wrong? I am not a Python programmer, and maybe Python has additional restrictions, but for types that you yourself define (and this way you can wrap primitives) you surely can do single dispatch in the object-oriented style.
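To make my point concrete, here is a sketch of the object-oriented, single-dispatch style I mean (the class and method names are my own, not from the article):

```python
class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):  # dispatched on the type of self
        return 3.141592653589793 * self.r ** 2


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# Which `area` runs is decided by each object's type at runtime,
# with no if-statement tree in the caller.
shapes = [Circle(1.0), Square(2.0)]
areas = [s.area() for s in shapes]  # [3.14159..., 4.0]
```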
I ran these functions on an array with 3 million elements of mixed types. This dispatched function finished in 0.039 seconds which was 50 times faster than f_py. Furthermore, the dispatched version is twice as fast as “Python-like” Julia. From this example we see how the dispatched version slashed if-statements and ran extremely fast.
More detail about the experiments would be interesting, because Julia is compiled and Python is interpreted; so, which did you think was fairer, including the compilation time or not?
multiple dispatch is even faster here is because the correct version of f is determined at runtime with a lookup table, and this avoids millions of if-statement evaluations.
No. Again, multiple dispatch has nothing to do with it here; you should say either “dispatch” or “dynamic dispatch”, because the “multiple” part has no relevance here. But, more important than that, dynamic dispatch, when it really occurs dynamically (i.e., when it is not optimized away), is much slower than an if-else branch. So what happens here has nothing to do with either multiple or dynamic dispatch, but with the Julia compiler optimizing the dynamic dispatch away and thereby removing the need for even an if-else branch (but only after a heavy compilation step that Python does not have).
Multiple dispatch on its own is fundamentally faster
No, it is not; see the comment above. There is nothing fundamentally faster about multiple dispatch, nor even about dynamic dispatch (which is the relevant characteristic here). There is a trade-off: dynamic dispatch may be faster than an if-else only when it is optimized away by a heavy compiler step, and I do not know whether you are including that step in your times or not.
The Julia compiler is amazing with dispatch. When code is annotated with type information, the compiler translates dispatched functions into extremely efficient LLVM code, optimized for the each specific type.
Thus, when Julia code is properly annotated with type information, the compiler can deduce the optimal construction of each function and produce a huge performance boost.
Yes and no; again, this is more or less what I was referring to above, but once more, this does not have to do specifically with multiple dispatch (at least not in the examples you timed), and these optimizations (which are part of the compiler, not of the language by itself) are probably the real reason your code is fast. But, also, the “When code is annotated with type information” tidbit is incorrect: type annotations often do not improve performance and are not necessary to obtain the performance you are looking for. See the Performance Tips section of the manual:
In many languages with optional type declarations, adding declarations is the principal way to make code run faster. This is not the case in Julia. In Julia, the compiler generally knows the types of all function arguments, local variables, and expressions. However, there are a few specific instances where declarations are helpful.
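For example (a sketch; the function name is mine): the completely unannotated function below still gets a specialized compiled version for each concrete argument type it is called with, no declarations needed:

```julia
# No type annotations at all: Julia infers the types from the
# arguments at the call site and compiles a specialized version.
sumsq(v) = sum(x -> x * x, v)

sumsq([1, 2, 3])        # 14, specialized for Vector{Int64}
sumsq([1.0, 2.0, 3.0])  # 14.0, specialized for Vector{Float64}
```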
In fact, the Julia compiler will even remove lines of code for you.
This means that when developers write unnecessary code into Julia, the compiler can eliminate parts of it! Amazing!
Again, I am not sure this merits so much surprise; I have the feeling many other languages do the same, as dead-code elimination is a standard optimization in compilers for static languages. It gives me the impression that you are not very familiar with such compilers in general.
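A sketch of the kind of thing I mean (the function name is mine; whether a given compiler actually drops the line is up to its optimizer, but this is the classic eligible case):

```julia
function plus_one(x)
    unused = 2 * x + 7  # result never used afterwards: a compiler can drop it
    return x + 1
end

plus_one(41)  # 42
```

Any optimizing compiler, not only Julia’s, will typically eliminate such an unused computation, so the observable behaviour is unchanged.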
Notice that the original definition of countlines is the same. This is the beauty of Julia. To extend functionality, you don’t touch your original definitions. You extend functionality and still be sure that the codebase will work before you updated countlines. In this case, you only need to test the new definitions. Also, in my opinion, the Julia version is easier to read.
I do kinda agree, but only if Python does not support dynamically adding new methods to existing “classes” (again, I do not know Python well enough, feel free to disregard this comment). Otherwise, for single-dispatch examples, the difference is that, in Python,
countlines should have been written as an object method, not an external function, and other “classes” could have been extended with a