Why do I get consistently worse performance running as a script than from REPL?

julia_ma_dot_bench.jl:

using LinearAlgebra, BenchmarkTools
import MutableArithmetics
const MA = MutableArithmetics;
setprecision(128);
sleep(2)
@btime MA.buffered_operate_to!(b, o, dot, x, y) setup=(o=BigFloat(); n=50; x=rand(BigFloat,n); y=rand(BigFloat,n); b=MA.buffer_for(dot,typeof(x),typeof(y)););

Running it as a script (noninteractively):

$ ./julia /home/nsajko/julia_ma_dot_bench.jl 
  9.418 μs (0 allocations: 0 bytes)
$ ./julia /home/nsajko/julia_ma_dot_bench.jl 
  9.418 μs (0 allocations: 0 bytes)
$ ./julia /home/nsajko/julia_ma_dot_bench.jl 
  9.368 μs (0 allocations: 0 bytes)

Running the code by copying to the REPL:

julia> @btime ...
  8.977 μs (0 allocations: 0 bytes)

julia> @btime ...
  8.830 μs (0 allocations: 0 bytes)

julia> @btime ...
  8.843 μs (0 allocations: 0 bytes)

This is on today’s nightly; I don’t know if that matters. Version info:

julia> versioninfo()
Julia Version 1.11.0-DEV.230
Commit 2c45e3ba9a2 (2023-08-04 00:35 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × AMD Ryzen 3 5300U with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver2)
  Threads: 1 on 8 virtual cores

What happens if you put several @btime in the script?
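For example (just a sketch of what I mean; the loop and the repetition count are arbitrary), something along these lines would show whether later measurements within the same process match the REPL:

using LinearAlgebra, BenchmarkTools
import MutableArithmetics
const MA = MutableArithmetics;
setprecision(128);
sleep(2)
# run the same benchmark several times in one process to see whether
# only the first measurement is the slow one
for _ in 1:3
    @btime MA.buffered_operate_to!(b, o, dot, x, y) setup=(o=BigFloat(); n=50; x=rand(BigFloat,n); y=rand(BigFloat,n); b=MA.buffer_for(dot,typeof(x),typeof(y)));
end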


Maybe your julia process has a lower process priority than wherever your REPL is running?

Could it be that you need to do:

./julia --startup-file=no julia_ma_dot_bench.jl

I don’t know whether you have a Julia startup file, but it might be doing something different in your REPL session. In any case, you generally want to benchmark code without it; I wish it were not loaded by default for scripts, but people don’t agree…

I did:

GC.gc(true)  # Note: I think this is not needed; implied by @btime?

julia> @benchmark MA.buffered_operate_to!(b, o, dot, x, y) setup=(o=BigFloat(); n=50; x=rand(BigFloat,n); y=rand(BigFloat,n); b=MA.buffer_for(dot,typeof(x),typeof(y));)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  12.903 μs … 67.682 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.901 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.429 μs ±  2.732 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

And with @btime I got (which shows median):

$ julia --startup-file=no julia_ma_dot_bench.jl
  13.133 μs (0 allocations: 0 bytes)

I tried to use @benchmark from a script, but then I get no output… even without the trailing ; that would suppress output (in the REPL).

I think @btime shows the minimum time, not the median. From the docs:

@btime prints the minimum time and memory allocation before returning the value of the expression …

The print from @benchmark seems to just be the returned object being shown, so maybe if you just wrapped the benchmark in a print it would work in a script as well?
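If that’s the reason, then (untested sketch, reusing the setup from the original script) explicitly displaying the Trial object should make the script print the same report the REPL shows:

using LinearAlgebra, BenchmarkTools
import MutableArithmetics
const MA = MutableArithmetics;
setprecision(128);
# a non-interactive script doesn't show top-level return values,
# so display the Trial object explicitly
t = @benchmark MA.buffered_operate_to!(b, o, dot, x, y) setup=(o=BigFloat(); n=50; x=rand(BigFloat,n); y=rand(BigFloat,n); b=MA.buffer_for(dot,typeof(x),typeof(y)))
display(t)  # or: show(stdout, MIME("text/plain"), t); println()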