Precompile a script?

Is there a way to precompile a script, so that I can execute it in batch fast?

To quote the official docs:

To create an incremental precompiled module file, add __precompile__() at the top of your module file (before the module starts). This will cause it to be automatically compiled the first time it is imported. Alternatively, you can manually call Base.compilecache(modulename).
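As a minimal sketch of what the docs describe (the module and function names here are made up for illustration):

```julia
# MyTools.jl -- hypothetical module file, placed somewhere on LOAD_PATH
__precompile__()  # opt in to incremental precompilation (the default since Julia 1.0)
module MyTools

export fastsum

# The module's lowered and type-inferred code is cached the first time
# the module is imported; later sessions load the cache instead.
fastsum(n::Integer) = sum(1:n)

end # module
```

The first `using MyTools` in a fresh session writes the cache file; subsequent sessions start from it. Note that this caches lowered code, not necessarily native code, which is one reason the speed-up can be smaller than hoped.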


What about a script, not a module?


Put the main computational functions of the script into a module (hopefully you are using functions, and not one long Matlab-style script full of globals).
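A sketch of that refactor (names hypothetical): move the top-level computation into a function inside a module, so it can be precompiled and fully type-inferred:

```julia
# ScriptCore.jl -- hypothetical module holding the script's computational kernel
module ScriptCore

export mean_of_squares

# A function (rather than top-level code with globals) lets the compiler
# infer concrete types and cache the lowered code along with the module.
mean_of_squares(xs) = sum(x -> x^2, xs) / length(xs)

end # module

# The script itself then shrinks to glue code:
using .ScriptCore
println(mean_of_squares([1.0, 2.0, 3.0]))
```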

However, it may not give you the hoped-for speed-up; see e.g. https://github.com/carlobaldassi/ArgParse.jl/issues/37

Hello,

Just wondering: why is the issue in ArgParse closed? ArgParse is still very slow, IMHO. I’ve tried moving my arg-parsing code into a module that is precompiled and everything, and it still takes over 4 seconds or so to parse three arguments.


I don’t know if you’re still having this issue, but I just created a new argument-parsing package that might help with that somewhat: https://github.com/zachmatson/ArgMacros.jl (ArgMacros.jl: fast, flexible, macro-based argument parsing for Julia).
It’s still in the middle of the three-day waiting period for adding new Julia packages to the registry, but it can already be installed from GitHub.


The stable docs link is broken, just so you know. Not sure if intended.

Yes, it will work once the package is accepted into the Julia registry and the first “stable” release gets tagged. Thank you for the heads-up, though.

For me (and good to know: try -O1; even -O0 is sometimes faster for short-running scripts):

$ julia -O0 --startup-file=no
julia> @time using ArgMacros
  0.321002 seconds (1.04 M allocations: 48.524 MiB, 2.98% gc time)

or, without -O0, on julia-1.5-DEV using my PR where I use -O1:

(@v1.5) pkg> add https://github.com/zachmatson/ArgMacros.jl#5a9b3a5

https://github.com/JuliaLang/julia/issues/35932#issue-620548735

Thanks! I find that ArgMacros.jl is much faster than ArgParse, which finally makes it feasible to start using Julia for small scripts. Combined with JSObjectLiteral (which it seems I never bothered to register), you can write:

using ArgMacros
using JSObjectLiteral

function parseargs()
    @beginarguments begin
        @argumentrequired String first "-f" "--first"
        @argumentdefault Number π second "-s" "--second"
    end

    return @js { first, second }
end

args = parseargs() ## Dict{String,Any}("second" => π,"first" => "hi")
println(args)
println(@js(args.first), ' ', args["second"])

Actually I see:

$ ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia -O3   # 10 days old master
julia> @time using ArgMacros
  0.021667 seconds (23.77 k allocations: 1.637 MiB)

similar time for any (or no) optimization setting (and that’s not with my PR). It seems we have much to look forward to in Julia 1.6, and it may already be this fast in 1.5, as the nightly I used is 24 days old.

I didn’t think much about usage outside of a main function (type stability, and making sure the compiler knows the types, was one of the big goals of ArgMacros), but I guess Julia may just be at the point right now where it makes more sense not to use one for short scripts. I might add untyped versions of all of the macros when I finish version 1.0, if that would be useful, or a variant of @beginarguments that disables typing.

I think it does make sense; just try julia -O0 if it’s very short-running.

An interpreter can also, in practice, be faster for short-running code, though I haven’t looked closely enough at JuliaInterpreter.jl; it may be meant for other purposes (only debugging?).

For [EDIT: some, e.g. very short-running] scripts I would really run with these flags, which have the fastest startup (e.g. over 5x faster) and not-too-bad performance:

$ ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia -O0 --compile=min --startup-file=no

julia> @time (s = 0; for i in 1:100000 s+=i end)
  0.167227 seconds (398.95 k allocations: 7.614 MiB)  # not too bad; slower than the 0.005288 seconds with the defaults
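As an aside: a large part of the allocations in the loop above come from `s` being an untyped global at top level; wrapping the loop in a function (a hypothetical `sumto` here) is the standard fix, whichever compiler flags are used:

```julia
# Hypothetical helper: with the loop inside a function, `s` is a local
# variable whose type can be inferred, so the loop body allocates nothing.
function sumto(n)
    s = 0
    for i in 1:n
        s += i
    end
    return s
end

sumto(100_000)  # == 5_000_050_000, i.e. n*(n+1)/2
```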

Such defaults can e.g. enable a 0.28 sec time-to-first-plot.

https://github.com/heliosdrm/GRUtils.jl/issues/61

Jeff reminded me of this undocumented option above my answer here:
https://github.com/timholy/Revise.jl/pull/484#issuecomment-633146603

https://github.com/JuliaLang/julia/issues/36017

I think it is a bit unfair to say that. Standard Julia:

julia> s = rand(10^6);

julia> @time sum(s);
  0.000364 seconds (1 allocation: 16 bytes)

--compile=min:

julia> s = rand(10^6);

julia> @time sum(s);
  3.618843 seconds (7.46 M allocations: 114.119 MiB, 0.20% gc time)

It’s good to know the option exists, but I don’t think it can be recommended as an “I would really run with these defaults” option.


You took out my context “For scripts”, which for me implies short-running code [EDIT: FYI: I’ve seen --compile=min be 3.7x faster than Julia’s defaults on a 30-sec, 12-line (excl. dependencies) script: Faster startup by PallHaraldsson · Pull Request #383 · JuliaInterop/RCall.jl · GitHub]. You can’t have the same good default for (very) long-running (HPC) code and for short-running scripts. There will always be trade-offs, so at least two good defaults depending on the use case; and I admit, after looking more into it, that this third option --compile=min is only good for very short-running scripts. It still might be a good default while developing, before your optimization phase.

I was trying to find a good balance, minimizing compilation time and moving towards Python-like defaults, but I see I went way beyond that, with a similar loop there faster than with --compile=min.

I tried “dogfooding” one of my Julia scripts from work, where Julia’s default is 32% (5.3 sec) slower than -O0 (which is also faster than -O1), or 44% (5.86 sec) slower on 1.6.0-DEV, while with the other option:

$ time ~/julia-1.3.1/bin/julia --startup-file=no fenics-db.jl

real	0m21,663s

$ time ~/julia-1.3.1/bin/julia --startup-file=no --compile=min fenics-db.jl

real	11m59,102s

Ouch. Still, the header of the script (what I put in using.jl) is faster, and for some short-running scripts the two-second gain there might not be lost later:

using CodecZlib  # a wrapper; probably fine with -O0.
using CSV        # highly tuned code; probably shouldn't go lower than the default optimization, and maybe it should selectively go higher and/or use a JLL?
using DataFrames

$ time ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia --startup-file=no using.jl 

real	0m3,678s

$ time ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia --startup-file=no --compile=min using.jl 

real	0m1,616s

In my ~/.julia/config/startup.jl I have only:

@time using Revise
println("Revise speed test")  # a reminder of how slow it is to load, and why you should use --startup-file=no when benchmarking

I occasionally see very slow loading of even this one package; here I had forgotten to exclude it:

$ time julia fenics-db.jl
 31.301598 seconds (2.85 M allocations: 106.293 MiB, 0.50% gc time)
Revise speed test

“good to know the option exists”

Yes, that’s why I added the Julia issue to get it added to --help, but I’m getting skeptical; maybe it should remain with the other undocumented options:

$ ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia --help-hidden
julia [switches] -- [programfile] [args...]
 --compile={yes|no|all|min}  Enable or disable JIT compiler, or request exhaustive compilation
[..]

Can scripts be precompiled into cache files? One could then include the cache files before running the script, skipping the script’s compilation.