When does the Julia compiler step in

I understand that Julia is a compiled language: it generates machine code (with the help of LLVM), and the CPU is what actually executes it.

My question is about when/how exactly the Julia compiler steps in while the code is executing. To motivate the discussion, I provide two code snippets below: one which is type stable and another which is not.

# type_stable.jl

foo(x) = x * 2

v = Vector{Int64}([1, 2, 3])

s = 0
for e in v
    global s += foo(e)
end

# type_unstable.jl

foo(x) = x * 2

v = Vector{Real}([1, 2, 3])

s = 0
for e in v
    global s += foo(e)
end


For type_stable.jl is everything compiled in one go? Or does Julia wait to compile foo until code execution reaches that point by putting a callback to itself to perform the compilation and then execute (execute meaning jump to where the callback just put the generated code)? The possible branches being whether or not it is the first iteration of the loop.

For type_unstable.jl, is everything compiled in one go (i.e. all possible foos are generated, and which one to execute is decided by an if statement on the type of e in the generated code)?

Or, again, does Julia wait to compile foo by putting a callback to itself in the generated code to generate code for foo and then execute it based on what type it finds for e (jump to where it just generated the code)? The possible branches being: it has seen the type of e before, in which case it looks up the address of the code it needs to jump to; or it hasn’t seen the type of e before, in which case it makes the callback to generate the code for that type and then executes it.

Or another likely option is that none of this makes sense and I have no idea what I’m talking about :slight_smile:

In both cases, foo is compiled at its first call, since all of this code is in global scope - no earlier.
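A rough way to observe this first-call compilation is with `@elapsed`: the very first call to `foo` with a given argument type usually pays the JIT compilation cost, and later calls just run the already-generated code. A sketch (timings are machine-dependent, so treat the comparison as a tendency, not a guarantee):

```julia
foo(x) = x * 2

t_first  = @elapsed foo(1)   # typically dominated by JIT compilation
t_second = @elapsed foo(2)   # just runs the already-generated code

foo(3)  # every later call with an Int reuses the same compiled code
```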

The difference between type stable & unstable code, and when the compiler “steps in”, really only becomes apparent with very type unstable code in local scope, like this:

foo(x::Int) = 1
foo(x::String) = 2.0

function bar()
    s = 0
    for _ in 1:10
        s += rand() < 0.5 ? foo(1) : foo("bar")
    end
    return s
end

If we take a look at type stability using @code_warntype:

julia> @code_warntype bar()
MethodInstance for bar()
  from bar()
     @ Main REPL[1]:1
  @_2::Union{Nothing, Tuple{Int64, Int64}}
  s::Union{Float64, Int64}
  @_4::Union{Float64, Int64}
Body::Union{Float64, Int64}
1 ─       (s = 0)
│   %2  = (1:10)::Core.Const(1:10)
│         (@_2 = Base.iterate(%2))
│   %4  = (@_2::Core.Const((1, 1)) === nothing)::Core.Const(false)
│   %5  = Base.not_int(%4)::Core.Const(true)
└──       goto #7 if not %5
2 ┄ %7  = @_2::Tuple{Int64, Int64}
│         Core.getfield(%7, 1)
│   %9  = Core.getfield(%7, 2)::Int64
│   %10 = s::Union{Float64, Int64}
│   %11 = Main.rand()::Float64
│   %12 = (%11 < 0.5)::Bool
└──       goto #4 if not %12
3 ─       (@_4 = Main.foo(1))
└──       goto #5
4 ─       (@_4 = Main.foo("bar"))
5 ┄ %17 = @_4::Union{Float64, Int64}
│         (s = %10 + %17)
│         (@_2 = Base.iterate(%2, %9))
│   %20 = (@_2 === nothing)::Bool
│   %21 = Base.not_int(%20)::Bool
└──       goto #7 if not %21
6 ─       goto #2
7 ┄       return s

We can see a lot of Union{Float64, Int64}, colored in red (which doesn’t show up on Discourse…). This happens because the compiler can only see that foo returns either an Int64 or a Float64, depending on a runtime value returned by rand(). Since that value is then also returned, and the return type thus depends on runtime information (not just argument types), we say that bar is “type unstable”.
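The same instability can be seen without reading the full `@code_warntype` output: `Base.return_types` reports what inference concludes for a given call signature (a sketch; the exact printing may vary between Julia versions):

```julia
foo(x::Int) = 1
foo(x::String) = 2.0
bar() = rand() < 0.5 ? foo(1) : foo("bar")

# one entry per matching method; here the inferred return type is a Union,
# reflecting the instability:
rts = Base.return_types(bar, Tuple{})   # expected: [Union{Float64, Int64}]
```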

Note that foo itself is perfectly stable!

julia> @code_warntype foo(1)
MethodInstance for foo(::Int64)
  from foo(x::Int64)
     @ Main REPL[3]:1
1 ─     return 1

julia> @code_warntype foo("bar")
MethodInstance for foo(::String)
  from foo(x::String)
     @ Main REPL[6]:1
1 ─     return 2.0

This follows from the same principle - each method of foo we’re calling here is type stable, because the output type is uniquely determined by the types of its arguments.

Back to your question - “when does the julia compiler step in”. In this case, the compiler first steps in when we call bar(). Purely semantically speaking, neither foo method is compiled until it is actually called - a hook to the compiler is inserted instead, which compiles the method in question when it’s needed, since it’s not clear whether either foo will ever be called. Practically speaking, it’s more complicated - for simple Unions like the one in my example here, the compiler can and often does compile both ahead of time and just inserts a runtime switch (so-called “dynamic dispatch”) to the already compiled appropriate method.

This doesn’t (and can’t) always happen though - if something is so unstable that it has to fall back to Any, the compiler has no choice but to insert a possible compilation of a new method, if a method that can take Any is available there. Another case where this happens is when the Union is too big (I think the limit was 5-6 elements), or when only an abstract type is inferred.


Practically speaking, it’s more complicated - for simple Unions like the one in my example here, the compiler can and often does already compile both ahead of time and just inserts a runtime switch (so called “dynamic dispatch”) to the already compiled appropriate method.

So sometimes when compiling a method, methods called inside that method will also be compiled right then? And if it’s unstable, multiple versions of the inner methods may be compiled, if there are few enough types to consider? But in general none of this is guaranteed? Is that the right understanding to take?

What about when the code is type stable and a method inside is sure to be called? Why not just compile that too while you’re at it and put in the appropriate address to jump to (and do it recursively)? Basically, go as far as you can until you hit a type instability, or a point at which you’re not sure whether the function will even be called? Or is that just not how it works, and the semantics is to compile lazily?

Separately, if the code in the for loop is type stable and the code for foo has been generated already, is there a lookup to find where the generated code for foo is (or whether it exists at all)? Or do we just jump straight to the code?

In the case of unstable code I imagine there has to be such a lookup for what code to jump to based on the type of the input. Something like: look up the type in a table. It returns a memory address or null. If a memory address, jump to it. If null, call back to the compiler to generate code. The call to the compiler returns a memory address where it put the code. Jump to it, and upon return insert that address into the table.

For the stable version I imagine an if statement checking if it’s the first iteration. If so, look up the type in the table. It returns a memory address or null (the code may have been compiled by a different part of the program). If null, call back to the compiler to generate code. The call to the compiler returns a memory address where it put the code. Store that locally and in the table, then jump to it. In future iterations just jump straight to the address you stored locally without performing lookups.

Does that seem to capture the general gist at a conceptual level?

That is exactly what happens in completely type stable code, yes. Your original type stable example just didn’t have a top level call to go into - it was all in global scope. The first function call was foo, so it only gets compiled then.

Yes and yes. You can call methods(f) to query the methods of a function, and each of those methods has a field specializations to check which specializations exist for that method. While you can manually inspect these, it’s not necessary - the compiler will already insert the appropriate call to the code (or inline it entirely, based on some heuristics).
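To make that concrete, here is a sketch of inspecting the specializations a call has created (this assumes Julia ≥ 1.10, where `Base.specializations` is available as the supported accessor):

```julia
foo(x) = x * 2
foo(1)       # triggers a specialization for Int64
foo(2.5)     # triggers one for Float64

m = only(methods(foo))
specs = collect(Base.specializations(m))
# we expect at least the Int64 and Float64 specializations to exist
length(specs)
```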

Conceptually, yes, though it’s not implemented that way.
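A toy model of the table the question describes - emphatically NOT how Julia actually implements this, just the concept of a per-type cache filled in on first miss (`CODE_CACHE` and `call_dynamic` are invented names for illustration):

```julia
# type → "compiled code" table, consulted on every dynamic call
const CODE_CACHE = Dict{Type, Function}()

function call_dynamic(f, x)
    T = typeof(x)
    compiled = get!(CODE_CACHE, T) do
        # stand-in for "call back into the compiler and generate
        # code specialized for T"; here we just store f itself
        f
    end
    return compiled(x)   # "jump" to the cached code
end

call_dynamic(x -> 2x, 3)   # miss: "compiles", caches, runs
call_dynamic(x -> 2x, 4)   # hit: goes straight to the cached entry
```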

Again, this lookup only happens because you’re in global scope in the first place. In global scope, you could in theory have some other code in a background task eval new methods or overwrite an existing method, and that new foo would have to be picked up by your globally scoped loop.

In regards to inlining & already compiling called functions in local scope (i.e. inside a function), take these for example:

f(x) = x^3
g(x) = 2*f(x)
h(x) = g(x)/4

If we just look at @code_warntype, we still see the call:

julia> @code_warntype h(1)
MethodInstance for h(::Int64)
  from h(x)
     @ Main REPL[3]:1
1 ─ %1 = Main.g(x)::Int64
│   %2 = (%1 / 4)::Float64
└──      return %2

But that’s only because @code_warntype is a very high level view. If we instead take a look at the generated LLVM IR, we can see that both g and f got inlined:

julia> @code_llvm h(1)
;  @ REPL[3]:1 within `h`
define double @julia_h_359(i64 signext %0) #0 {
; ┌ @ REPL[2]:1 within `g`
; │┌ @ REPL[1]:1 within `f`
; ││┌ @ intfuncs.jl:320 within `literal_pow`
; │││┌ @ operators.jl:580 within `*` @ int.jl:88
      %1 = shl i64 %0, 1
      %2 = mul i64 %1, %0
; │└└└
; │┌ @ int.jl:88 within `*`
    %3 = mul i64 %2, %0
; └└
; ┌ @ int.jl:97 within `/`
; │┌ @ float.jl:269 within `float`
; ││┌ @ float.jl:243 within `AbstractFloat`
; │││┌ @ float.jl:146 within `Float64`
      %4 = sitofp i64 %3 to double
; │└└└
; │ @ int.jl:97 within `/` @ float.jl:386
   %5 = fmul double %4, 2.500000e-01
; └
  ret double %5
}

The inlining is of course not guaranteed, but it is generally reliable that type stable code gets compiled all the way through. That’s why it’s important to keep code type stable.

Of course, some operations are inherently type unstable, like parsing. Colloquially speaking, “parsing” is always a map from some String or Vector{UInt8} to some type T that’s more specific than Any. Since parsing can return any of the subtypes of Any though, it’s an inherently type unstable operation that will fall back to Any in the worst case (or some abstract type in a better case).
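A concrete instance of this is Meta.parse: it maps a String to whatever the text denotes, so its return type depends on the *value* of the input, not its type - inherently unstable:

```julia
a = Meta.parse("42")       # the literal, an Int64
b = Meta.parse("x + 1")    # an expression, an Expr

typeof(a), typeof(b)       # two different types from the same function
```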

I should note that eval does not affect function calls in local scope - the results of an eval are only “visible” to existing functions once they return to global scope. The mechanism this is tracked by is called “world age”, which is a sort of advanced topic that’s not really relevant for regular development. Suffice it to say that it’s the core mechanism that allows julia to compile type stable calls to native machine code, while still staying semantically dynamic.
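A minimal sketch of world age in action (`demo` is an invented name): a method defined via eval inside a running function is not visible to that function’s own world, but Base.invokelatest explicitly runs in the newest world:

```julia
f() = 1

function demo()
    @eval f() = 2                  # new method lives in a newer world age
    old = f()                      # this call still sees the old method
    new = Base.invokelatest(f)     # this one uses the latest world
    return (old, new)
end

demo()   # → (1, 2)
```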


Is it correct to say then that type unstable code is what can force a runtime lookup of which code to execute (either with a switch statement or some kind of table in the generated code), whereas type stable code will either inline the function or hard-code the address to jump to? Additionally, type unstable code may have to hand control back to the compiler during run time, whereas type stable code just compiles once ahead of time and then executes all the way through? And these are the reasons why type stable code is so much more performant than type unstable code? (I’m just talking about local scope in this post - I didn’t appreciate that distinction when I originally posted.)


Yes, all that is precisely correct! :slight_smile:

What may be seen as even worse is that type unstable code has the nasty ability to propagate outwards - if the return type of one function is unstable and you use the result of that function as a return value as well, or pass it to other functions, the compiler can’t magically get rid of that (well, it can in some cases - but it’s best to think of that as an optimization, not as a guarantee).
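A sketch of that propagation - and of the classic “function barrier” fix: once the unstable value is passed as an argument to another function, that callee gets compiled for the concrete runtime type, so the instability stops there (`unstable`, `process` and `pipeline` are invented names):

```julia
unstable() = rand() < 0.5 ? 1 : 2.0    # returns Union{Int64, Float64}

process(x) = x * 10                     # stable once x is concrete

function pipeline()
    v = unstable()      # instability starts here...
    return process(v)   # ...but process is compiled per concrete type
end

pipeline()   # either 10 or 20.0, via a dispatch at the barrier
```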