Trying to understand optimization via multiple dispatch and type stability

(Word of warning, I’m not well-versed in programming in general and have only used Python for simple tasks, so it would be helpful for me if even simple concepts are explained with some detail.)

I’m interested in trying Julia for its dynamic typing and high performance. I’ve read multiple articles explaining that this is done via multiple dispatch and type stability. In simple terms, we can write a function with unspecified types like in Python, but Julia will compile a method for each combination of argument types upon the first call. If Julia can expect a specific return type for a method, it makes something like a C function where all input and output types are specified. Therefore, Julia can be easy to write like Python and can have the efficiency of C.
However, these articles don’t go beyond that, and I do have questions about how this works:

  1. Say I write a function that has an if statement to check an argument’s type (bear with me, I know it’s easier to just write separate methods) and do different things depending on it. Am I correct in thinking that Julia is smart enough to make a method for each condition?

  2. If a function call within the main function is not type-stable, am I correct in thinking that Julia must incorporate type checking after that function call and select from different compiled subroutines? If so, how does Julia implement this? Does it update the method as unexpected types show up?

  3. I read that type-unstable functions have a performance cost due to the increased need to type-check. I can see that adding up to a large problem if it is called in other functions and forces THOSE functions to type-check, or if it is heavily repeated as in a for-loop, but is it that big of a deal if it is only intended to be called manually?

1 Like

Just a warning before I start, this will contain a mix of what can be done and what the current implementation does. Almost none of what I’m going to say is guaranteed behavior since they all only affect performance. You should treat most of what I said as current implementation detail. They should help you understand how things are done but the detail can change between versions without warning.

While this is kind of true it is unrelated to your type check. Julia doesn’t care if you have a branch that checks type. It’ll specialize the function for each input type indepedent of that. Your type checking condition will likely just be constant folded in each specialization.
In another word, julia will make specializations for each condition (and possibly more) but that’s not because you write the condition.

If you mean whether the “main” function has to have multiple versions to expect different result returned from this type unstable function, then no, it doesn’t have to. If you just means that code that uses the type unstable result have to include runtime type check then yes.

Not really sure what you mean here… It looks at the type and pick the correct method and specialization, which is what runtime dispatch means… Anymore more detail than this are basically C functions that looks at all sort of properties of types.
Or if you mean if julia has to do something very special to deal with this then no. Julia is perfectly capable of running without knowing the type information, much like how python can work at all. Type inference and specialization are only optimization.

What do you mean by “method” here? In your context, there’s a main function, there’s the type unstable function call, and there are function calls following the unstable one that’s in the “main” function. Which one(s) are you talking about?
Also, what do you mean by “unexpected”. If a result that doesn’t agree with type inference shows up then that’s an inference bug and julia will just crash.

What do you mean by calling manually? FWIW, on one hand, the computer will never call a function out of the blue, it is always caused by some “manual” action and on the other hand, it’s always code that calls functions.

Judging by what you said before I assume you mean code in REPL/global scope? (or annything that’s not called in a function or a loop?) Well, none of what you said is significant. The only thing significant is how many times the unstable result is used. You can write a hot loop in global scope and the performace will absolutely matters (and you should avoid it). You can also write a function that takes multiple miliseconds and run it a few million times in a loop and the performance won’t matter as much just because how much time is spent elsewhere.

FWIW, this is always the answer you’ll get whenever you ask about if it’s a big deal for performance. It’ll always depend on the exact situation.

3 Likes

(My response is numbered similarly to how I did in the original post)

  1. Yes, that is what I meant, thank you for clarifying. The if statement in that context is entirely redundant.

2a. Okay, there really is one method per combination of argument types, just that it will include more type-checking as needed. BTW by method I mean the specialized function by argument types, like when you enter “+” into the REPL it tells you that it’s a “generic function with 163 methods”.

2b. I asked that question assuming Julia must update the method somehow every time the unstable function returns a different type, but I’m getting the sense that Julia doesn’t HAVE to know the type of every variable, it can check the unstable type of a variable at runtime and select the proper methods for the function calls that take the variable as an argument.

2c. I made that assumption because I couldn’t imagine how Julia knows if a function is type stable or not in advance. I essentially assumed that Julia assumed every function is type stable until otherwise proven in a future main function call.
This is something I still don’t quite understand: when compiling a method, how does Julia know when it needs to type check a type-unstable function call or when it can optimize with a type-stable function call?
Can Julia determine if a function is type-stable? I doubt this is the case because I can define a function F that calls a function G without G ever having been defined; if inner function calls are checked for type stability at all, I would expect that glaring omission to raise an error.

  1. yeah I meant in the REPL or global scope. Thanks for the tips and perspective.

method is what you define. Specialization is what the compiler generates. You can have a function f with a single method f(a, b) = a + b but when you call it with f(1, 2) and f(1.0, 2.0) the compiler will likely generate two specializations for it. Whenever you said “method” above, you were almost always talking about specialization and not method in “generic function with 163 methods”.

It looks at the code of the function and infer the type, that’s what type inference is doing. You can look at the result using code_typed, check the doc for it for more details/tools.

Well, julia cannot know the concrete answer all the time. Knowing the output type of a function in all cases is undecidable. However, it is still possible to compute an upper bound of the type. If julia cannot find the definition of a function and it believes it can be supplied later, it can just assume the return type can be anything. There’s no case where inference have to throw an error so you can never expect the absense of an error to tell you anything about inference (other than you didn’t hit a bug).
Also, inference is run after the input types are known. Simply defining a function without using it anywhere usually gives no hint to the compiler what to infer from. Of course if you defined a function that can only accept one set of concrete type input then the compiler could infer that method and generate specialization for it. However, the compiler usually don’t do extra work unless it has to so this is not done except when it makes sense to be aggressive about compilation (e.g. when precompiling packages/generating system image).

1 Like

This was in yuyichao’s last answer, but buried and I think where your confusion is. Method specialization happens the first time a function is called with a new set of types. Thus for example in your 2c last case, if G isn’t defined when the method is called, you have a name error. Thus the question is never

Is this function type stable?

but rather

What is the return type of this function with these input types?

Thus for the code

function f(a)
    return g(a)
end

when you first call f with a new type, it will see what type g returns when fed that type, and then will know that f returns that type. Note that this process can also happen before f is called (eg if h needs to know what f will return), and can happen multiple times (eg if g gets a new method).

1 Like

These were very useful clarifications and information, thank you!