What's the difference between `PackageCompiler.create_sysimage()` and `Pkg.precompile()`?

Hi all!

Pretty much the title question. I’d like to try Julia for my Master’s thesis (it’s my first time with it), and like many others I was annoyed by the initial package loading time, especially when I want to quickly try out some code.

Googling around, it seems like PackageCompiler’s sysimages are the way to go. However, a video webinar by Julia Computing uses Pkg.precompile() instead.

What’s the difference between the two? Is one preferable to the other for some reason?


precompile caches type-inferred code, while create_sysimage caches native machine code. The difference is that a system image removes essentially all of the package loading time but is not relocatable between machines, while precompile files are relocatable but only remove part of the startup time.
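For concreteness, a minimal sketch of both workflows (assuming PackageCompiler.jl is installed, with Plots as the example package):

import Pkg

# Pkg.precompile stores type-inferred code as .ji cache files under
# ~/.julia/compiled (recent Julia versions also trigger this
# automatically after Pkg.add).
Pkg.precompile()

using PackageCompiler

# create_sysimage bakes native code for the listed packages into a
# custom system image; start Julia with `julia --sysimage sys_plots.so`
# to use it.
create_sysimage([:Plots]; sysimage_path="sys_plots.so")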


Precompile files have roughly the same issues with relocatability as sysimages in the sense that any path that gets stored during precompilation will likely be invalid on another machine.
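As a small illustration (the exact layout varies across Julia versions), those caches are ordinary .ji files in your depot, and any absolute paths recorded inside them are machine-specific:

# the precompile caches live under the first entry of DEPOT_PATH,
# typically ~/.julia/compiled
joinpath(first(DEPOT_PATH), "compiled")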


Thanks to both!

@Oscar_Smith would you mind expanding on what you meant here?

When you say “type-inferred code”, do you mean the code expanded to reflect the actual types used in your script, like C++ templates do? Or is it something else?

Kind of. For a basic example, consider this code:

julia> function fib(x)
           x < 1 && return 1
           return fib(x-1) + fib(x-2)
       end
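Calling the function with different argument types produces separate compiled specializations, which you can watch with @time (an illustrative sketch; the exact timings will vary):

@time fib(30)    # first call: compiles a specialization for Int, then runs it
@time fib(30)    # second call: reuses the cached native code
@time fib(30.0)  # new argument type (Float64): compiles a fresh specialization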

When a user writes fib(10), the function gets compiled for an Int input type. The first step is parsing and lowering, which converts the code to

julia> @code_lowered fib(10)
CodeInfo(
1 ─ %1 = x < 1
└──      goto #3 if not %1
2 ─      return 1
3 ─ %4 = x - 1
│   %5 = Main.fib(%4)
│   %6 = x - 2
│   %7 = Main.fib(%6)
│   %8 = %5 + %7
└──      return %8
)

The main change in this step is that the code is converted to static single assignment (SSA) form. Next comes type inference, which produces

julia> @code_typed optimize=false fib(10)
CodeInfo(
1 ─ %1 = (x < 1)::Bool
└──      goto #3 if not %1
2 ─      return 1
3 ─ %4 = (x - 1)::Int64
│   %5 = Main.fib(%4)::Int64
│   %6 = (x - 2)::Int64
│   %7 = Main.fib(%6)::Int64
│   %8 = (%5 + %7)::Int64
└──      return %8
) => Int64

This is the result of Julia running abstract interpretation over the code and tracking which types are possible at each point. After this, optimizations like inlining are applied to the typed code, which you can see by dropping optimize=false:

julia> @code_typed fib(10)
CodeInfo(
1 ─ %1  = Base.slt_int(x, 1)::Bool
└──       goto #3 if not %1
2 ─       return 1
3 ─ %4  = Base.sub_int(x, 1)::Int64
│   %5  = Base.slt_int(%4, 1)::Bool
└──       goto #5 if not %5
4 ─       goto #6
5 ─ %8  = Base.sub_int(%4, 1)::Int64
│   %9  = invoke Main.fib(%8::Int64)::Int64
│   %10 = Base.sub_int(%4, 2)::Int64
│   %11 = invoke Main.fib(%10::Int64)::Int64
│   %12 = Base.add_int(%9, %11)::Int64
└──       goto #6
6 ┄ %14 = φ (#4 => 1, #5 => %12)::Int64
│   %15 = Base.sub_int(x, 2)::Int64
│   %16 = Base.slt_int(%15, 1)::Bool
└──       goto #8 if not %16
7 ─       goto #9
8 ─ %19 = Base.sub_int(%15, 1)::Int64
│   %20 = invoke Main.fib(%19::Int64)::Int64
│   %21 = Base.sub_int(%15, 2)::Int64
│   %22 = invoke Main.fib(%21::Int64)::Int64
│   %23 = Base.add_int(%20, %22)::Int64
└──       goto #9
9 ┄ %25 = φ (#7 => 1, #8 => %23)::Int64
│   %26 = Base.add_int(%14, %25)::Int64
└──       return %26
) => Int64

Here, the main change is that calls to simple functions have been inlined, i.e. replaced by the function's body. This optimized, typed code is the level that precompilation stores. After this, it gets handed to LLVM (the compiler infrastructure that also backs Clang), which produces LLVM IR

julia> @code_llvm debuginfo=:none fib(10)
define i64 @julia_fib_514(i64 signext %0) #0 {
top:
  %1 = icmp sgt i64 %0, 0
  br i1 %1, label %L4, label %common.ret

common.ret:                                       ; preds = %L4, %top
  %common.ret.op = phi i64 [ %6, %L4 ], [ 1, %top ]
  ret i64 %common.ret.op

L4:                                               ; preds = %top
  %2 = add nsw i64 %0, -1
  %3 = call i64 @julia_fib_514(i64 signext %2) #0
  %4 = add nsw i64 %0, -2
  %5 = call i64 @julia_fib_514(i64 signext %4) #0
  %6 = add i64 %5, %3
  br label %common.ret
}

which LLVM then lowers to native code (this is what sysimages store)

julia> @code_native debuginfo=:none fib(10)
	.text
	.file	"fib"
	.globl	julia_fib_538                   # -- Begin function julia_fib_538
	.p2align	4, 0x90
	.type	julia_fib_538,@function
julia_fib_538:                          # @julia_fib_538
	.cfi_startproc
# %bb.0:                                # %top
	testq	%rdi, %rdi
	jle	.LBB0_1
# %bb.3:                                # %L4
	pushq	%r15
	.cfi_def_cfa_offset 16
	pushq	%r14
	.cfi_def_cfa_offset 24
	pushq	%rbx
	.cfi_def_cfa_offset 32
	.cfi_offset %rbx, -32
	.cfi_offset %r14, -24
	.cfi_offset %r15, -16
	movq	%rdi, %rbx
	decq	%rdi
	movabsq	$julia_fib_538, %r15
	callq	*%r15
	movq	%rax, %r14
	addq	$-2, %rbx
	movq	%rbx, %rdi
	callq	*%r15
	addq	%r14, %rax
	popq	%rbx
	.cfi_def_cfa_offset 24
	popq	%r14
	.cfi_def_cfa_offset 16
	popq	%r15
	.cfi_def_cfa_offset 8
	.cfi_restore %rbx
	.cfi_restore %r14
	.cfi_restore %r15
	retq
.LBB0_1:
	movl	$1, %eax
	retq
.Lfunc_end0:
	.size	julia_fib_538, .Lfunc_end0-julia_fib_538
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits
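To tie this back to the original question: the inference-and-optimization step above can be requested ahead of time with Base's precompile function, which is roughly what building a package's .ji cache does for its code (a small sketch):

precompile(fib, (Int,))   # runs inference and optimization for fib(::Int)
                          # without executing it; returns true on success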

Your answer is a treasure, thank you! :smiling_face_with_tear:
Fun fact: my thesis is going to be on LLVM, so I think I can learn something from Julia’s compiler.

Ooh, yay! We always appreciate having more people in the community with LLVM knowledge.


[edit: posted as standalone question]