[201908] Compiling a Data Science Image for Julia on Windows to combat compilation latency

I wanted to share some steps on how to compile a Data Science Image for Julia.

1. Remove existing installation of PackageCompiler.jl

Firstly, if you have previously installed PackageCompiler.jl, make sure you remove it entirely:

  • In Julia ]rm PackageCompiler
  • Manually remove the PackageCompiler folder from C:\Users\UserName\.julia\dev\PackageCompiler
  • Manually remove the PackageCompiler folder from C:\Users\UserName\.julia\packages\PackageCompiler
  • Manually remove the PackageCompiler folder from C:\Users\UserName\.julia\compiled\v1.1\PackageCompiler
  • Close down Julia
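The manual folder deletions above can also be scripted. This is only a minimal sketch: `remove_packagecompiler_leftovers` is a hypothetical helper name (not part of any package), and it assumes the depot layout described in the list, with `v1.1` as the compiled-cache version.

```julia
# Sketch: delete leftover PackageCompiler folders from a Julia depot.
# `remove_packagecompiler_leftovers` is a hypothetical helper, not a Pkg API.
function remove_packagecompiler_leftovers(depot::AbstractString;
                                          julia_version::AbstractString = "v1.1")
    # The three locations listed above, relative to the depot (e.g. C:\Users\UserName\.julia).
    candidates = [
        joinpath(depot, "dev", "PackageCompiler"),
        joinpath(depot, "packages", "PackageCompiler"),
        joinpath(depot, "compiled", julia_version, "PackageCompiler"),
    ]
    removed = String[]
    for dir in candidates
        if isdir(dir)
            rm(dir; recursive = true, force = true)
            push!(removed, dir)
        end
    end
    return removed  # the folders that were actually deleted
end

# Usage, assuming the default depot is the first entry of DEPOT_PATH:
# remove_packagecompiler_leftovers(first(DEPOT_PATH))
```

Run `]rm PackageCompiler` first, then restart Julia after the folders are gone.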

2. Install PackageCompiler

Use the sd-notomls branch of PackageCompiler:

]up; add PackageCompiler#sd-notomls

3. Install all Data Science packages!

]add GeometryTypes PooledArrays Arrow Tracker ScikitLearnBase 
add IteratorInterfaceExtensions SparseArrays VisualRegressionTests DelimitedFiles
add Images Dates LaTeXStrings ImageMagick DataStructures BinaryProvider 
add CodecZlib DataValues Statistics UnicodePlots Distances CategoricalArrays 
add RDatasets Random Test FileIO LinearAlgebra
add Tables TableTraits WeakRefStrings FlatBuffers Parameters

using PackageCompiler
pkgs = [
  :Plots
  ,:GR
  ,:StatsPlots
  ,:DataFrames
  ,:DataFramesMeta
  ,:StatsBase
  ,:GLM  
  ,:Clustering
  ,:Flux
  ,:Lazy
  ,:Feather
  ,:DecisionTree
  ,:CSV
  #,:Query
  #,:TableReader
]

@time sysnew, sysold = PackageCompiler.compile_incremental(pkgs..., install = true)

4. Replace the sys.dll with the compiled one

Now go to C:\Users\USERNAME\.julia\packages\PackageCompiler. There should be a sub-folder named with a random string code, e.g. sTrwT. Go inside it to find the sysimg folder.

Now copy the sys.dll from that folder to C:\Users\USERNAME\AppData\Local\Julia-1.1.1\lib\julia. Back up the existing sys.dll first if you want to be able to restore it.
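The backup-and-copy step could be sketched as below. `install_sysimage` is a hypothetical helper name, not a PackageCompiler function, and both paths are passed in by the caller, so nothing here assumes a particular install location. Windows may refuse to overwrite a DLL that a running Julia process has loaded; in that case the copy has to be done by hand with Julia closed.

```julia
# Sketch: back up the existing system image, then copy the new one over it.
# `install_sysimage` is a hypothetical helper, not part of PackageCompiler.
function install_sysimage(new_sysimg::AbstractString, julia_lib_dir::AbstractString)
    target = joinpath(julia_lib_dir, basename(new_sysimg))
    backup = target * ".bak"
    # Only create the backup once, so a later run never overwrites
    # the original image with an already-replaced one.
    if isfile(target) && !isfile(backup)
        cp(target, backup)
    end
    cp(new_sysimg, target; force = true)
    return target, backup
end

# Usage (paths are placeholders following the steps above):
# install_sysimage(raw"C:\path\to\sysimg\sys.dll",
#                  raw"C:\Users\USERNAME\AppData\Local\Julia-1.1.1\lib\julia")
```

To roll back, copy sys.dll.bak over sys.dll.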

5. Use it

Close down Julia and open it again. You should now notice that the packages load instantly!
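To check that the restarted session really picked up the replaced image, something like the following may help. It is a small sketch that only reads the image path Julia reports for the current session; the timing line at the end is left as a comment because it assumes Plots is installed.

```julia
using Dates

# Path of the system image this Julia session was started with.
loaded_image = unsafe_string(Base.JLOptions().image_file)
println("System image in use: ", loaded_image)

# A recent modification time suggests the freshly compiled file is the one in use.
println("Last modified: ", Dates.unix2datetime(mtime(loaded_image)))

# With the image confirmed, package load time can be measured in a fresh session:
# @time using Plots
```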

Notes

On my machine it took about 20 minutes to compile all the packages. Even after compiling there is still a time-to-first-plot problem, because Plots.jl’s tests don’t contain many plot(...) calls and PackageCompiler only compiles what those tests exercise. Looking for tips on how to overcome this! Thanks!

Is the #notomls branch still necessary? If so, is there any way for people to agree on what needs to be done for us to merge stuff we need onto master/tag on a release? @arnavsood @sdanisch?

From my testing, that branch is necessary, or else it will fail.

Seems to be working on v1.2 as well. Did some editing.

Edit:
Actually, the compiled DLL doesn’t work. When I start Julia with the DLL it crashes right away.

I don’t think the sd-notomls branch is necessary for the reason we needed it before (grabbing test deps, so that we could manually fudge things). And I can get things to AOT-compile without it by using master.

If you have a look at (for ex.) some of his recent NextJournal images, they also aren’t using it… so dunno.

If I use master, this is the error I get. But maybe I shouldn’t use install=true?

[ Info: used 3677 out of 3679 precompile statements
ERROR: MethodError: no method matching compile_incremental(::String, ::String; install=true)
Closest candidates are:
  compile_incremental(::Union{Nothing, String}, ::String; force, verbose, debug, cc_flags) at C:\Users\RTX2080\.julia\packages\PackageCompiler\h11nw\src\incremental.jl:134 got unsupported keyword argument "install"
Stacktrace:
 [1] kwerr(::NamedTuple{(:install,),Tuple{Bool}}, ::Function, ::String, ::String) at .\error.jl:125
 [2] (::getfield(PackageCompiler, Symbol("#kw##compile_incremental")))(::NamedTuple{(:install,),Tuple{Bool}}, ::typeof(compile_incremental), ::String, ::String) at .\none:0
 [3] #compile_incremental#63(::Array{Any,1}, ::Base.Iterators.Pairs{Symbol,Bool,Tuple{Symbol},NamedTuple{(:install,),Tuple{Bool}}}, ::Function, ::Symbol, ::Symbol, ::Vararg{Symbol,N} where N) at C:\Users\RTX2080\.julia\packages\PackageCompiler\h11nw\src\incremental.jl:177
 [4] (::getfield(PackageCompiler, Symbol("#kw##compile_incremental")))(::NamedTuple{(:install,),Tuple{Bool}}, ::typeof(compile_incremental), ::Symbol, ::Symbol, ::Symbol, ::Vararg{Symbol,N} where N) at .\none:0
 [5] top-level scope at util.jl:156
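Going only by the error text above, the method on master lists the keywords `force, verbose, debug, cc_flags`, and `install` is not among them, which is what raises the MethodError. The sketch below reproduces that mechanism with a stand-in function; `demo_compile_incremental` is hypothetical and is NOT the real PackageCompiler function.

```julia
# Stand-in carrying the keyword list shown in the error message.
# NOT the real PackageCompiler.compile_incremental.
demo_compile_incremental(toml, snoopfile::String;
                         force = false, verbose = false,
                         debug = false, cc_flags = nothing) = (toml, snoopfile)

# Returns true if calling `f` throws a MethodError, false otherwise.
function throws_method_error(f)
    try
        f()
        return false
    catch err
        return err isa MethodError
    end
end

# An unsupported keyword is rejected, just like `install = true` on master.
with_install = throws_method_error(
    () -> demo_compile_incremental(nothing, "snoop.jl"; install = true))

# Without the unsupported keyword the call goes through.
without_install = throws_method_error(
    () -> demo_compile_incremental(nothing, "snoop.jl"))

println((with_install, without_install))
```

If that reading is right, calling `PackageCompiler.compile_incremental(pkgs...)` on master without `install = true` might get past this particular error, followed by copying the image by hand as in step 4. That is an untested assumption.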

You linked to old images (for Julia 1.0). The newer images (Julia 1.2) use #notomls. See a recent Nextjournal post here. Sorry for the missing link; I am on my phone.

EDIT:
recent discourse post: Julia 1.2 in Nextjournal
current images: https://nextjournal.com/julia
