Which LLVM version to use: are 10 or 11 possible? And how to reduce startup time (for the Benchmarks Game)?

A.
I know I’m using LLVM 9.0.1, but is the latest release, 10 (or 11), coming to master?

I’m just curious if Julia will optimize better (it’s already good, better than C), no rush.

It seems JuliaLang/julia uses LLVM, not the LLVM from Yggdrasil (where version 10 is available).

I’m a bit confused seeing 8 here, and 9 before; what’s the meaning of that:

and what is:

Can you switch LLVM at runtime?

B.
I was looking into Fasta at the Benchmarks Game, and despite it looking to be 2.3x slower than Haskell (fastest) and C, it’s actually faster than the C code on my machine when excluding the startup cost:

With PackageCompiler.jl I can reduce the startup cost, but not to zero as in the REPL. I know about the filtered option; how much is it possible to filter out in theory? Everything but the GC? E.g. LLVM (for that program; I know that in general you might need it). That program (like most) relies on the GC (and Base.Threads and IO), but let’s say it didn’t: could you also strip out the GC?

A. LLVM 8 is used for Julia 1.4. In general, you shouldn’t expect newer versions of LLVM to magically speed up code (sometimes they do, and that’s a nice surprise). It looks like LLVM 10 support is at least in progress on master. Switching the LLVM used for codegen at runtime probably isn’t possible (but someone may contradict me on this).
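
For anyone unsure which LLVM their own build uses, here is a minimal sketch of how to check from within Julia. It assumes the unexported constant `Base.libllvm_version` is available in your Julia version (it is an internal name, so that is an assumption rather than a documented guarantee); `versioninfo()` prints the same information in its `LLVM:` line.

```julia
using InteractiveUtils  # provides versioninfo() when run outside the REPL

# LLVM version this Julia binary was built against (fixed at build time).
println("Julia ", VERSION, " is linked against LLVM ", Base.libllvm_version)

# versioninfo() lists the LLVM library among other build details.
versioninfo()
```

Since the version is baked into the binary, comparing LLVM versions in practice means comparing different Julia builds.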

B. Hi. I wrote that Fasta benchmark. There’s definitely room for improvement there. In general, I’ve found that using Channels has a large negative effect on compile time. If you want to speed it up, the best way would be to rewrite it in a way that doesn’t rely on Channels. You’ll get a lot more mileage out of that than out of the more exotic changes you’re suggesting.
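
To see the kind of first-call cost being described, here is a rough sketch on a toy workload. The functions `sum_via_channel` and `sum_direct` are hypothetical and are not taken from the Fasta program. The first call to each function includes compilation, while the second is mostly run time, so the gap between the two gives a crude idea of compile overhead. Actual numbers will vary by machine and Julia version.

```julia
# Hypothetical toy workload, not the Fasta benchmark itself.
# Producer task feeds values through a buffered Channel.
function sum_via_channel(n)
    ch = Channel{Int}(32) do c
        for i in 1:n
            put!(c, i)
        end
    end
    total = 0
    for x in ch          # iteration ends when the producer finishes
        total += x
    end
    return total
end

# Same result computed with a plain loop, no tasks or Channels.
function sum_direct(n)
    total = 0
    for i in 1:n
        total += i
    end
    return total
end

# First call includes compilation; second call is mostly run time.
t_channel_first = @elapsed sum_via_channel(1000)
t_channel_again = @elapsed sum_via_channel(1000)
t_direct_first  = @elapsed sum_direct(1000)
t_direct_again  = @elapsed sum_direct(1000)

println("channel: first call ", t_channel_first, " s, second call ", t_channel_again, " s")
println("direct:  first call ", t_direct_first, " s, second call ", t_direct_again, " s")
```

This only illustrates how to measure the effect; whether removing Channels pays off for Fasta itself would need to be checked on the real program.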

The maintainer of the benchmarksgame isn’t likely to alter the Julia benchmarks to use a compile step like PackageCompiler (from previous conversations I’ve had with him).

Also, the GC isn’t a standard library, so the standard library filtering option in PackageCompiler doesn’t apply to it. In general, filtering standard libraries shouldn’t affect performance, only the size of the generated binary.

Yes, I know filtering will not affect the generated assembly.

Your program seems fastest as is; I’m not sure there is any definite improvement to be had at runtime (only at compile time? And yes, I want that excluded from the “runtime”…):

julia> include("Fasta/src/Fasta.jl")

julia> @btime Fasta.fasta(25000000, devnull)
  985.302 ms (41595 allocations: 956.91 MiB)

[The C program (and I guess the Haskell program, claimed fastest) takes over 1 sec on my machine.]

However (next line applies to above and below):
$ export JULIA_NUM_THREADS=4

$ time julia-1.5-012b270df6/bin/julia -O3 -- fasta.jl 25000000 >/dev/null

real	0m2.019s
user	0m5.920s
sys	0m0.559s

$ time FastaAppCompiled/bin/Fasta 25000000 >/dev/null

real	0m1.786s
user	0m5.759s
sys	0m0.598s
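
As a rough back-of-the-envelope check, the overhead outside the benchmark function can be estimated by subtracting the in-process `@btime` result from the wall-clock times. This is only a sketch: it assumes the `@btime` minimum (written to `devnull`) is comparable to the work done in the timed runs, which is not exactly true. The variable names are made up for illustration.

```julia
# Timings copied from the measurements in this post (seconds).
in_process  = 0.985302   # @btime Fasta.fasta(25000000, devnull)
wall_script = 2.019      # time julia -O3 -- fasta.jl 25000000
wall_app    = 1.786      # time FastaAppCompiled/bin/Fasta 25000000

# Everything not accounted for by the in-process run: startup, loading, JIT, etc.
overhead_script = round(wall_script - in_process, digits=2)
overhead_app    = round(wall_app - in_process, digits=2)

println("script overhead:       ", overhead_script, " s")
println("compiled-app overhead: ", overhead_app, " s")
println("saved by compiling:    ", round(wall_script - wall_app, digits=3), " s")
```

Under those assumptions, the compiled app removes only a small part of the overhead, leaving most of a second unaccounted for.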

I do not recall if I used this to compile your program:

export JULIA_LLVM_ARGS=-unroll-threshold=500

but it didn’t seem to matter; maybe it would with another value?

Interesting, but wouldn’t apply to PackageCompiler at runtime?