Hi!
I have some functions in a loop that get called a huge number of times. Some of them have two different forms, one when my elements are triangles and one when they are quads. Normally, I’d write two different for loops, check whether I have quads or tris before entering the loop, and then call the Tri function in one of them and the Quad function in the other. This would avoid an if condition to check which function to call on every loop iteration.
With multiple dispatch, I can have the same function written twice, once with 3 inputs and once with 4 inputs. No if statement, no two loops.
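A minimal sketch of the pattern described above, assuming 2D elements stored as tuples of node indices. The names `area`, `total_area`, `nodes`, `tris`, and `quads` are hypothetical and not from the original post:

```julia
# Same function name, one method per number of corner points.
# Triangle area from the 2D cross product of two edges.
area(a, b, c) = 0.5 * abs((b[1] - a[1]) * (c[2] - a[2]) - (c[1] - a[1]) * (b[2] - a[2]))
# Convex quad: split into two triangles along the a-c diagonal.
area(a, b, c, d) = area(a, b, c) + area(a, c, d)

# One loop for both element kinds. Each element is a tuple of node indices,
# so its length (3 or 4) is part of the element type.
function total_area(nodes, elements)
    s = 0.0
    for el in elements
        # Splatting a tuple: the number of arguments is known from the type.
        s += area(map(i -> nodes[i], el)...)
    end
    return s
end

nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
tris  = [(1, 2, 3), (1, 3, 4)]   # Vector{NTuple{3,Int}}
quads = [(1, 2, 3, 4)]           # Vector{NTuple{4,Int}}

total_area(nodes, tris)
total_area(nodes, quads)
```

Because each call to `total_area` sees a container with a single concrete element type, there is no explicit `if` in the loop body and no second loop.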
I haven’t seen a performance penalty from doing that, but maybe I’m not testing enough. Are there no performance disadvantages in doing what I described? Isn’t the code effectively performing an if statement to choose the right function?
Thanks a lot!
This depends on what your data is. If you are dealing with something like a `Vector{Quad}` or `Vector{Tri}`, then multiple dispatch is zero cost. If you have a `Vector{Union{Tri,Quad}}`, then Julia will probably just put an if statement in. If you have a `Vector{Shape}` (where `Quad` and `Tri` are subtypes of `Shape`), then multiple dispatch will probably be slower, as it will be doing dispatch at runtime.
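The three container types can be set up side by side as a rough sketch. The type definitions and the helpers `nnodes` and `count_nodes` are assumptions made for illustration, not code from the thread:

```julia
abstract type Shape end

struct Tri <: Shape
    nodes::NTuple{3,Int}
end

struct Quad <: Shape
    nodes::NTuple{4,Int}
end

# One method per element type; which one runs is chosen by dispatch.
nnodes(::Tri) = 3
nnodes(::Quad) = 4

# The same loop serves all three containers.
count_nodes(elems) = sum(nnodes, elems)

# Concrete element type: the method is known when count_nodes is compiled.
concrete = [Tri((1, 2, 3)), Tri((2, 3, 4))]

# Small Union element type: the compiler can branch on the two possibilities.
small_union = Union{Tri,Quad}[Tri((1, 2, 3)), Quad((1, 2, 3, 4))]

# Abstract element type: the method may have to be looked up at runtime.
abstract_vec = Shape[Tri((1, 2, 3)), Quad((1, 2, 3, 4))]

count_nodes(concrete)
count_nodes(small_union)
count_nodes(abstract_vec)
```

All three calls return the right answer; they differ only in how the call to `nnodes` is resolved. At the REPL, `@code_warntype count_nodes(abstract_vec)` or a benchmark on larger vectors is one way to check what the compiler actually does on a given Julia version.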
I see. Makes sense. My vector is either all quads or all tris, so Julia is figuring it out from there. Great!
Thanks a lot for the answer!
To be clear, this is not a problem specific to multiple dispatch.
It is also a problem for single dispatch.
It's a dynamic dispatch vs. static dispatch problem.
Does the dispatch need to be resolved at runtime, or can it be worked out at compile time?
This is in contrast to some other languages with multiple dispatch, where using multiple dispatch is strictly slower than single dispatch.
This has historically given multiple dispatch a bad reputation.
But Julia has done the hard thing, and worked out how to make multiple dispatch fast.
That being said, the costs of dynamic dispatch might be higher for multiple dispatch, since single dispatch can use vtables. Fortunately, Julia was designed from the beginning to have semantics that enable “devirtualization” (static dispatch) in typical performance-critical code, whereas this has to be retrofitted onto C++.
You will hit a performance penalty if the number of types increases, which can be dealt with by manual splitting. There are some threads discussing that, and solutions, for example:
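Independently of those threads, here is a minimal sketch of what manual splitting can look like, assuming it means explicit `isa` branches over the concrete types. The types `Tri`, `Quad`, `Hexa` and the helpers `nnodes`, `nnodes_split`, `total_nodes` are hypothetical:

```julia
abstract type Shape end

struct Tri <: Shape
    nodes::NTuple{3,Int}
end

struct Quad <: Shape
    nodes::NTuple{4,Int}
end

struct Hexa <: Shape
    nodes::NTuple{8,Int}
end

nnodes(::Tri) = 3
nnodes(::Quad) = 4
nnodes(::Hexa) = 8

# Manual splitting: test the concrete type by hand. Inside each branch the
# compiler knows the type of `s`, so that call can be resolved statically.
function nnodes_split(s::Shape)
    if s isa Tri
        return nnodes(s)
    elseif s isa Quad
        return nnodes(s)
    elseif s isa Hexa
        return nnodes(s)
    else
        return nnodes(s)  # fallback for any other subtype: ordinary dispatch
    end
end

function total_nodes(elems::Vector{Shape})
    n = 0
    for s in elems
        n += nnodes_split(s)
    end
    return n
end

mixed = Shape[Tri((1, 2, 3)), Quad((1, 2, 3, 4)), Hexa((1, 2, 3, 4, 5, 6, 7, 8))]
total_nodes(mixed)
```

The branches look redundant, but each one narrows the type of `s`. Whether this actually beats plain dispatch depends on the number of types and the Julia version, so it is worth benchmarking before adopting it.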