Julia Type system manual confusions

It can’t be the first, since not all possible types are known at compile time. Consider the code snippet:

function f(x, y)
    x + sin(y)
end
a1 = f(0, 1)
a2 = f(a1, 1.0)

The function f contains two function calls, a call to sin and a call to +. When the compiler encounters the call f(0, 1), it looks up the method table for f to see if it can be called with two Ints. Indeed, there’s a method where both x and y can be anything, i.e. Any. The compiler fetches this definition, finds the function calls inside, and checks whether sin can be called with an Int:

julia> methods(sin)
...
 [10] sin(x::Real)

Sure there is. It looks inside. The first thing that happens there is xf = float(x), then the call sin(xf). It checks if float can be called with an Int. Sure. It will return a Float64. It checks if sin can be called with a Float64. Sure, it will return a Float64. It now knows that the call sin(y) inside f will return a Float64.
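If you want to repeat these lookups by hand, something like the following should work. It is only a sketch using the standard reflection functions hasmethod and which; the variable names are mine, and the exact method that which prints may differ between Julia versions.

```julia
# Is there any method of sin that accepts an Int?
has_sin_int = hasmethod(sin, Tuple{Int})

# Which method would be selected for an Int argument?
sin_method = which(sin, (Int,))

# float on an Int yields a Float64, and so does sin on an Int.
float_type = typeof(float(1))
sin_type = typeof(sin(1))

println(sin_method)
println(float_type, " ", sin_type)
```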

Then there’s the + call inside f. It will be called with x, which is an Int, and the result of sin(y), which is a Float64. It checks if + can be called with an Int and a Float64. Sure, there is a method:

julia> methods(+, Tuple{Int,Float64})
# 1 method for generic function "+" from Base:
 [1] +(x::Number, y::Number)

It looks inside, and figures out that there is a conversion in there: the + call with an Int and a Float64 promotes both arguments to a common type, which converts x to a Float64, and then calls Core.Intrinsics.add_float with two Float64s, which returns a Float64.
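The conversion step can be seen directly with promote_type and promote, which are the standard Base functions that mixed-type arithmetic on Numbers goes through. This is just an illustration with made-up variable names, not a trace of what the compiler does internally.

```julia
# The common type that an Int and a Float64 are promoted to.
common = promote_type(Int, Float64)

# promote converts both values to that common type.
promoted = promote(1, 2.0)

# The method selected for +(::Int, ::Float64), and the type of the result.
plus_method = which(+, (Int, Float64))
sum_type = typeof(1 + 2.0)

println(common, " ", promoted, " ", sum_type)
```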

So, f(0, 1) will return a Float64. It now compiles f, including + and sin, for two Int inputs, knowing the types of everything that goes on inside. And it knows that a1 will be a Float64.
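You can ask the compiler for its conclusion without running f. A small sketch, assuming Base.return_types behaves as in recent Julia versions (it is a reflection helper, not part of the exported API); rt_int and rt_float are just names I picked:

```julia
function f(x, y)
    x + sin(y)
end

# Inferred return types for the two call signatures discussed here.
rt_int = only(Base.return_types(f, (Int, Int)))
rt_float = only(Base.return_types(f, (Float64, Float64)))

println(rt_int, " ", rt_float)
```

Both should come out as Float64, matching the step-by-step reasoning above.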

Now, the compiler finds the call f(a1, 1.0). It goes through the same steps for f, but this time f is called with two Float64s. It compiles f for these inputs as well (the + and sin inside were both already compiled for Float64 when f(::Int, ::Int) was compiled). The results of the compilation are saved with the method definition of f, and can be reused when further calls are encountered:

julia> mt = methods(f)
# 1 method for generic function "f" from Main:
 [1] f(x, y)

julia> m = mt[1]
f(x, y)

julia> m.specializations
svec(MethodInstance for f(::Int64, ::Int64), MethodInstance for f(::Float64, ::Float64), nothing, nothing, nothing, nothing, nothing)

So, the next time the compiler encounters a call to f, it checks if there is already a MethodInstance for the argument types inside the specializations list.
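The internal layout of m.specializations has changed between Julia versions, so the field may not always print as an svec like above. A hedged alternative, assuming Julia 1.10 or newer where I believe Base.specializations is available (f_demo is a throwaway copy of f so the list starts empty):

```julia
function f_demo(x, y)
    x + sin(y)
end

# Each call with a new combination of argument types triggers a compilation.
f_demo(0, 1)
f_demo(1.0, 1.0)

# Collect the specializations recorded on the single method of f_demo.
m = only(methods(f_demo))
specs = collect(Base.specializations(m))

foreach(println, specs)
```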

Now, this is all fine in this example. The type of everything can be inferred from the types of the input arguments. Sometimes this is not the case, e.g. here, where we put things inside the function h:

g(x) = x < 0 ? 0 : x
function h(x,y)
    a1 = f(x, y)
    a2 = f(a1, 1.0)
    a3 = g(a2)
    a3 + a2
end
h(0, 1)

Now, as before, the compiler knows that a2 will be a Float64, but it doesn’t get to see the value of a2, only the type. So it can’t figure out what type g will return: it’s either 0, which is an Int, or x, which is a Float64. When it subsequently encounters a3 + a2, it does not know what types this + will be called with. It’s either an Int and a Float64, or two Float64s, and it must insert code for handling both cases. I.e. it must do “dynamic dispatch” at runtime, which slows things down considerably. (Though if there are only two cases, like here, it has a quick way to handle this, called “union splitting”.) And in this case it figures out that h(0,1) returns a Float64 anyway:

julia> @code_warntype h(0,1)
MethodInstance for h(::Int64, ::Int64)
  from h(x, y) @ Main REPL[74]:1
Arguments
  #self#::Core.Const(Main.h)
  x::Int64
  y::Int64
Locals
  a3::Union{Float64, Int64}        # The type of a3 is not fully inferred
  a2::Float64
  a1::Float64
Body::Float64
...
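One common way to avoid this kind of Union is to make both branches return the same type. The following is only a sketch: g_unstable repeats the g from above under a different name, and g_stable is a hypothetical variant that uses zero(x) so the constant has the same type as x.

```julia
# Same as g above: returns an Int or a Float64 depending on the value of x.
g_unstable(x) = x < 0 ? 0 : x

# Hypothetical variant: zero(x) has the type of x, so both branches agree.
g_stable(x) = x < 0 ? zero(x) : x

rt_unstable = only(Base.return_types(g_unstable, (Float64,)))
rt_stable = only(Base.return_types(g_stable, (Float64,)))

println(rt_unstable)
println(rt_stable)
```

With a Float64 argument, the first should infer as a Union of Float64 and Int, and the second as plain Float64.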

We can make an even worse example:

g(x) = x < 0 ? x < -1 ? x < -2 ? Int8(0) : Int16(0) : 0 : x
function h(x,y)
    a1 = f(x, y)
    a2 = f(a1, 1.0)
    a3 = g(a2)
    a3 + a3
end
julia> @code_warntype h(0,1)
MethodInstance for h(::Int64, ::Int64)
  from h(x, y) @ Main REPL[122]:1
Arguments
  #self#::Core.Const(Main.h)
  x::Int64
  y::Int64
Locals
  a3::Union{Float64, Int16, Int64, Int8}
  a2::Float64
  a1::Float64
Body::Any

Now, the return type of g is one of four different types, so a3 can have any of them. That is too many for union splitting of the a3 + a3 call, and nothing is inferred about the return type of h(0,1), i.e. it’s inferred as Any.
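To see the four possibilities concretely, you can call this g with values from each branch and look at the runtime types. A small sketch; g4 is just a renamed copy of the g above, and the exact inferred Union printed may vary with the Julia version:

```julia
# Renamed copy of the four-branch g from above.
g4(x) = x < 0 ? x < -1 ? x < -2 ? Int8(0) : Int16(0) : 0 : x

# One input per branch, and the type that branch returns.
branch_types = [typeof(g4(v)) for v in (-3.0, -1.5, -0.5, 1.0)]

# What inference concludes for a Float64 argument: not a concrete type.
rt_g4 = only(Base.return_types(g4, (Float64,)))

println(branch_types)
println(rt_g4, " concrete: ", isconcretetype(rt_g4))
```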

The short answer to your question is that the function will be compiled when it’s actually called, when the types of the arguments are known. But often the compiler can do this in advance because it can infer the types of the arguments, like with the sin and + inside f above.
