I will try to explain how it works because this is a good opportunity to see how well I understand it myself.
Bottom line: Julia compiles a native version of a function the first time it is run with a certain set of argument types (without creating any build artifacts).
Now let’s get into more details…
The core devs like to call Julia a “just ahead of time” compiler. This is in contrast to AOT (ahead of time) compilers (e.g. for C or C++), which compile a static binary upfront, and classical JIT (just in time) compilers, which usually start by interpreting and tracing your program and then compile hot spots to native code behind the scenes.
Julia works more like an AOT compiler in that sense because it does not do any tracing but compiles (almost) everything, just not in a separate compilation stage before runtime.
The following things happen when you pass your code to the Julia compiler either by executing a script or typing it into the REPL (I will be glossing over details such as parsing and lowering because they are not interesting in the scope of this discussion):
- Julia runs type inference on your code to generate typed code.
- The typed code gets compiled to LLVM IR (Intermediate Representation).
- The IR gets handed over to LLVM which generates fast native code.
- The native code gets executed.
One of the beautiful things about Julia is that this is not a black box and you can observe all steps of the process if you wish to.
Let’s use the following simple function as an example and type it into the REPL:
julia> function add(a, b)
           return a + b
       end
add (generic function with 1 method)
If I want to see the results of type inference, I can use the @code_typed macro:
julia> @code_typed add(1, 1)
CodeInfo(
1 ─ %1 = Base.add_int(a, b)::Int64
└── return %1
) => Int64
I have used two 64-bit integers as arguments, and type inference has determined that the return type will be an Int64 as well.
julia> @code_typed add(1, 1.0)
CodeInfo(
1 ─ %1 = Base.sitofp(Float64, a)::Float64
│ %2 = Base.add_float(%1, b)::Float64
└── return %2
) => Float64
If one argument is a Float64, the return type will be Float64 as well.
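The same inference results can also be queried programmatically. The following is a minimal sketch, assuming the unexported helper Base.return_types (not used in the session above, and its exact output format may differ between Julia versions):

```julia
# Same definition as in the REPL session above.
function add(a, b)
    return a + b
end

# Base.return_types is an unexported helper (an assumption of this sketch).
# It returns a vector with one inferred return type per matching method.
int_result   = Base.return_types(add, (Int64, Int64))
mixed_result = Base.return_types(add, (Int64, Float64))

println(int_result)
println(mixed_result)
```

Since add has a single method, each vector should contain exactly one entry, matching what @code_typed reports for the same argument types.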
To see what happens when this gets compiled to LLVM IR, we can use @code_llvm:
julia> @code_llvm add(1, 1)
; @ REPL[2]:1 within `add'
define i64 @julia_add_303(i64 signext %0, i64 signext %1) {
top:
; @ REPL[2]:2 within `add'
; ┌ @ int.jl:87 within `+'
%2 = add i64 %1, %0
; └
ret i64 %2
}
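To compare the IR generated for different argument types, the function form code_llvm can be used to capture the output as a string. This is a rough sketch, assuming it is run as a script (hence the explicit using InteractiveUtils, which the REPL loads automatically). The variable names ir_int and ir_float are made up for illustration:

```julia
using InteractiveUtils  # loaded automatically in the REPL, needed in a script

function add(a, b)
    return a + b
end

# sprint passes an IO buffer as the first argument to code_llvm
# and returns everything written to it as a String.
ir_int   = sprint(code_llvm, add, (Int64, Int64))
ir_float = sprint(code_llvm, add, (Float64, Float64))

# The integer specialization uses an integer add instruction,
# the floating point specialization uses a floating point add.
println(occursin("add i64", ir_int))
println(occursin("fadd double", ir_float))
```

The one generic add method ends up with a separate native specialization per set of argument types, which is why the two IR strings differ.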
And the resulting native code can be shown with @code_native:
julia> @code_native add(1, 1)
.section __TEXT,__text,regular,pure_instructions
; ┌ @ REPL[2]:2 within `add'
; │┌ @ int.jl:87 within `+'
leaq (%rdi,%rsi), %rax
; │└
retq
nopw %cs:(%rax,%rax)
; └
We can also observe the compiler at work by timing the execution of the function with @time.
# `a` is a random vector of Float64
julia> a = randn(1000);
julia> typeof(a)
Vector{Float64} (alias for Array{Float64, 1})
julia> function mysum(v)
return sum(v)
end
mysum (generic function with 1 method)
julia> @time mysum(a)
0.029869 seconds (80.40 k allocations: 4.757 MiB, 99.92% compilation time)
-8.915810948177993
The first time we run mysum with a as an argument, i.e. mysum(v::Vector{Float64}), the function gets compiled, and more than 99% of the measured time is spent compiling.
The second time around the result of the compilation is cached and there is no compilation overhead.
julia> @time mysum(a)
0.000005 seconds (1 allocation: 16 bytes)
-8.915810948177993
Now we run mysum with an argument of a different type, i.e. mysum(v::UnitRange{Int64}), and the compiler needs to run again.
julia> b = 1:1000
1:1000
julia> typeof(b)
UnitRange{Int64}
julia> @time mysum(b)
0.005921 seconds (8.62 k allocations: 444.457 KiB, 99.70% compilation time)
500500
But again, there is no compilation overhead on the second run.
julia> @time mysum(b)
0.000004 seconds (1 allocation: 16 bytes)
500500
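The first-call versus second-call difference can also be captured in a script instead of reading @time output. The following is a small sketch, assuming the hypothetical names mysum_demo and data (chosen so it does not clash with the session above) and using @elapsed, which returns the elapsed seconds as a number:

```julia
# Hypothetical stand-in for mysum from the session above.
function mysum_demo(v)
    return sum(v)
end

# A deterministic Vector{Float64} instead of random data.
data = collect(1.0:1000.0)

# The first call with this argument type includes compilation.
t_first = @elapsed mysum_demo(data)

# The second call reuses the cached native code.
t_second = @elapsed mysum_demo(data)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The absolute numbers depend on the machine and the Julia version, but the first measurement should be noticeably larger than the second.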
The results of compilation are cached within the running Julia process, which means that no compilation artifacts are written to disk and everything needs to be recompiled once Julia is restarted.
You also mentioned precompiling, which is a different concept in the Julia world and is explained in detail in this tutorial: Tutorial on precompilation