How to compile a portable binary (at least across macs) with `juliac.jl`

Cool news! Thanks to @simonbyrne, I’ve managed to compile a julia program on one mac, ship it to another mac that’s never even heard of julia, and certainly doesn’t have it installed, and it runs!! (And it’s a 10-year-old computer!) Just wanted to share what we’ve learned.

I did this using juliac.jl from PackageCompiler, but the below thoughts should apply to anyone trying to run julia with a different cpu target.


Here are the things I learned:

1. code is compiled for a cpu target

When julia is invoked, all the code it compiles is compiled for a specific cpu target. By default, I believe it uses the most specific cpu target it can – presumably to get the best performance possible. On my machine at least, julia calls this the “native” cpu target. This default target, native, is also what is used when building its default sysimg (sys.dylib).

You can specify a different cpu target to use via the -C or --cpu-target command line option. The default is -Cnative. If you wanted to broaden the supported cpu architectures to, say, all Intel 64-bit cpus, you could invoke julia as julia -Cx86-64.

2. :open_mouth: ERROR: Target architecture mismatch. Please delete or regenerate sys.{so,dll,dylib}. :open_mouth:

However, if you just change the target architecture, things don’t work. You’ll get the above error, complaining that the machine code it’s emitting and its sysimg (sys.dylib) weren’t built for the same architecture. But fret not! We can simply ask julia to ignore the sysimg. From julia -h:

 --precompiled={yes|no}    Use precompiled code from system image if available

3. :smiley: --precompiled=no --compilecache=no :smiley:

Cool, so now, this works! (but it’s slower of course)

$ julia -Cx86-64 --precompiled=no
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _  |  |
  | | |_| | | | (_| |  |  Version 0.6.2 (2017-12-13 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-apple-darwin14.5.0

julia> println("Hooray! it works!")
Hooray! it works!

julia> # cool, let's do some stuff

julia> using UnicodePlots
INFO: Precompiling module UnicodePlots.
ERROR: Julia and the system image were compiled for different architectures.
Please delete or regenerate sys.{so,dll,dylib}.
ERROR: write: broken pipe (EPIPE)
Stacktrace:
 [1] try_yieldto(::Base.##296#297{Task}, ::Task) at ./event.jl:189
 [2] wait() at ./event.jl:234
 [3] uv_write(::Base.PipeEndpoint, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:811
 [4] unsafe_write(::Base.PipeEndpoint, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:832
 [5] unsafe_write(::Base.PipeEndpoint, ::Base.RefValue{UInt8}, ::Int64) at ./io.jl:293
 [6] write(::Base.PipeEndpoint, ::UInt8) at ./stream.jl:873
 [7] write_as_tag(::Pipe, ::Int32) at ./serialize.jl:128
 [8] serialize(::SerializationState{Pipe}, ::DataType) at ./serialize.jl:542
 [9] serialize(::SerializationState{Pipe}, ::Expr) at ./serialize.jl:330
 [10] create_expr_cache(::String, ::String, ::Array{Any,1}) at ./loading.jl:633
 [11] compilecache(::String) at ./loading.jl:709
 [12] _require(::Symbol) at ./loading.jl:497
 [13] require(::Symbol) at ./loading.jl:405

— wait what!? So, --precompiled=no is enough to stop using the sysimg, but now we have the problem that the precompiled packages we’re importing were precompiled for the wrong target. So let’s turn off reading from the cached, precompiled packages as well. Right under that last flag in julia -h is this one:

 --compilecache={yes|no}   Enable/disable incremental precompilation of modules

Turning compilecache off will stop importing precompiled package images.

**So putting those together, the correct way to invoke julia with a custom cpu target is as follows:**

$ julia -Cx86-64 --precompiled=no --compilecache=no

Cool! So why did all this come up in the first place? Ah yes, I was trying to compile a distributable binary using juliac.jl. So it turns out juliac.jl already has a flag to specify the cpu target (also -C or --cpu-target), which just gets forwarded along to julia when it’s invoked. So I just had to change juliac to also set --precompiled=no --compilecache=no when the user sets -C!


So, assuming that PR goes through, or some version of it, you should just be able to pass -C<target> or --cpu-target=<target> to juliac in order to compile for a different target.

If, as in my case, you want your compiled binary to support any intel-based mac, you would invoke it like this:

julia ~/.julia/v0.6/PackageCompiler/juliac.jl -vaej  --cpu-target=x86-64 examples/hello.jl

and it will fill examples/builddir with everything you need to run this binary on another computer:

  1. the binary itself, hello
  2. the binary’s “sysimg”: hello.dylib
  3. all the julia libs it needs to link against: libjulia.dylib, libLLVM.dylib, libamd.dylib, etc…

(And it will also have the temporary file hello.o in there. You can delete that.)
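Packaging the build directory for shipping can be sketched like this (assuming the builddir layout above; the archive name hello-dist.tar.gz is made up):

```shell
# Run from the examples/ directory after juliac.jl has produced builddir.
rm -f builddir/hello.o                # drop the temporary object file
tar -czf hello-dist.tar.gz builddir   # binary + hello.dylib + julia libs

# Then, on the receiving mac:
#   tar -xzf hello-dist.tar.gz && cd builddir && ./hello
```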

And now you can simply zip up builddir and send it to another mac computer, and they can open it and run ./hello and

hello, world
sin(0.0) = 0.0
      ┌────────────────────────────────────────┐
    1 │⠀⠀⠀⠀⠀⠀⠀⡠⠊⠉⠉⠉⠢⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠀⠀⠀⠀⠀⢠⠎⠀⠀⠀⠀⠀⠀⠘⢆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠀⠀⠀⠀⢠⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠳⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠀⠀⠀⢠⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠱⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠀⠀⢠⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠳⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠀⢀⠇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠀⡎⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
      │⠼⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠬⢦⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⢤│
      │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⡆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠇│
      │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡎⠀│
      │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠱⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡞⠀⠀│
      │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠱⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡜⠀⠀⠀│
      │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠱⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡞⠀⠀⠀⠀│
      │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⢆⠀⠀⠀⠀⠀⠀⢠⠎⠀⠀⠀⠀⠀│
   -1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⣀⣀⣀⠔⠁⠀⠀⠀⠀⠀⠀│
      └────────────────────────────────────────┘
      0                                      100

voila


Thanks for the writeup. Now the question is: what’s the oldest computer someone can get this to run on?


@NHDaly thanks for the hint on using --precompiled=no and --compilecache=no with --cpu-target, and glad to see you find juliac.jl useful!
I have just merged your PR; in the next few days I will also make those two options directly available from juliac.jl, together with a few others that I am adding. Stay tuned! :grinning:


:laughing: hooray! Yes, it is immensely cool. Really, just the baseline ability for julia to emit a compiled object-file is immensely cool by itself, but juliac is a very slick packaging. :smile:

Glad to help work on it!!

This is very cool. Does it mean that the “first hit” problem is completely mitigated away?

@NHDaly I have now found that on my system (Linux) I do not need to set both --precompiled=no and --compilecache=no when using --cpu-target; instead, either one of the following is enough:

  • set --compilecache=no (even without --precompiled=no)

or:

  • comment out the following lines in static_julia.jl:
command = `$julia_cmd -e $expr`
verbose && println("Populate \".ji\" local cache:\n  $command")
run(command)

So the issue seems to be localized in the local modules precompilation, and I wonder if we actually hit a Julia bug. Can you please verify the two points above in your system?

So, I think other people (like github/SimonDanisch?) could probably answer this better, but I think in general: yes, ahead-of-time compiling your julia scripts should completely mitigate away the “first hit” problem!

This post was about compiling a program into an executable (or a shared object for linking into a c++/python program). If you are interested in removing the first hit latency from a julia package, there’s PackageCompiler.jl, which is the same concept but specifically for compiling packages! :smiley: (juliac.jl and PackageCompiler share a lot of the same code.)

Hmm, no unfortunately this doesn’t seem to work on my system:

$ julia -Ccore2 --compilecache=no
ERROR: Julia and the system image were compiled for different architectures.
Please delete or regenerate sys.{so,dll,dylib}.

Oh yeah, you’re right! It totally worked. Cool. I did as you suggested (without my PR), and it did in fact compile and work on the other computer.
(5f31f7e is the commit right before my PR that added the --precompiled=no --compilecache=no flags.)

$ rm -rf builddir ;  git checkout 5f31f7e1b7c7b2aca  &&  sed -i.bak '207,209s/^/#/' src/static_julia.jl  &&  julia juliac.jl -vaej -Cx86-64  examples/hello.jl

Hmm, yeah you might be right, because I don’t think it has to do with the availability of those .ji files – it’s the act of building them that fails.

If you build them ahead of time (on the wrong architecture), it doesn’t complain when you change the architecture but skip that step:

$ git checkout -- . && rm -rf builddir  &&  git checkout 5f31f7e1b7c7b2aca  &&  julia juliac.jl -vaej examples/hello.jl  && sed -i.bak '207,209s/^/#/' src/static_julia.jl  &&  julia juliac.jl -vaej -Cx86-64  examples/hello.jl

So maybe it is a bug? I’m not familiar enough with these flags to say.

Yes you are right, also on my system with -Ccore2 I need both --precompiled=no and --compilecache=no.

@jameson I ask you because some time ago you added the following line to juliac.jl:

run(command) # first populate the .ji cache (when JULIA_HOME is defined)

which then became the following lines in static_julia.jl (part of PackageCompiler.jl):

command = `$julia_cmd -e $expr`
verbose && println("Build module image files \".ji\" in directory \"$builddir\":\n  $command")
run(command)

You may remember when we talked about that commit here.

I wonder if we should actually do both of these:

  1. remove the lines above from static_julia.jl (so do not create a local .ji cache)

and

  2. always pass --precompiled=no --compilecache=no to julia when compiling

On my system I find negligible time difference (if any) when compiling hello.jl with local .ji cache or without it and passing --precompiled=no --compilecache=no, so I wonder if the local .ji cache is actually of any use.

> and
>
>   2. always pass --precompiled=no --compilecache=no to julia when compiling
>
> On my system I find negligible time difference (if any) when compiling hello.jl with local .ji cache or without it and passing --precompiled=no --compilecache=no, so I wonder if the local .ji cache is actually of any use.

I don’t think this is a good idea… on my system there is consistently a very significant time difference when turning off precompilation and compile caches, with or without passing -C:

 19:34:03 $ time julia -Ccore2 --precompiled=no --compilecache=no -E "2 + 2"
4
julia -Ccore2 --precompiled=no --compilecache=no -E "2 + 2"  13.02s user 0.24s system 101% cpu 13.128 total
 19:34:18 $ time julia -E "2 + 2"
4
julia -E "2 + 2"  1.63s user 0.18s system 115% cpu 1.569 total
 19:35:36 $ time julia --precompiled=no --compilecache=no -E "2 + 2"
4
julia --precompiled=no --compilecache=no -E "2 + 2"  13.29s user 0.28s system 100% cpu 13.543 total
 19:35:53 $ time julia -E "2 + 2"
4
julia -E "2 + 2"  1.68s user 0.18s system 114% cpu 1.618 total

I don’t know much about your question #1, but it seems to me like it’s a useful optimization to not have to rebuild all the packages each time you invoke juliac.jl, unless they’ve changed.

Have you tried compiling hello.jl using juliac.jl, first as it is, and then commenting out the lines above in static_julia.jl and passing --precompiled=no --compilecache=no?

I thought all this was fixed in v0.6 if the build knows it’s going to be redistributed (for example, the buildbots set some extra flags to enable it), so there’s really no point in following any of these steps anymore.

But anyways, I think the best way to handle it (assuming you didn’t just configure the JULIA_CPU_TARGET build target configuration in Make.user in the first place) is by doing the following steps to first rebuild Julia:

./julia -J usr/lib/julia/sys.so --output-ji sys.ji -e nothing # extract .ji image
./julia -J sys.ji --output-o newsys.o -C new-target -e nothing # regenerate new `.o` file for new target machine
ld newsys.o -shared -o newsys.so -ljulia

Then you should be able to use this on any machine that supports “new-target” as the base image (which should also propagate itself to recursive targets):
./julia -J newsys.so

:hushed: oh? Could you elaborate on this? In the post, I was using v0.6.2. Is there some other, v0.6 way to tell Julia that the build is going to be redistributed?

Why is this preferable? It seems like an unnecessary step to have to rebuild the julia sysimg and then have to compile the sysimg with the user’s code. Can you elaborate on what the difference is? Thanks for commenting here! :slight_smile:

(relatedly then, do you think we should reevaluate the changes we made in the juliac PR?)