Fluctuations when measuring execution time of linear algebra code

I reran all of my experiments with the two changes we talked about (using a main function and GC.enable(false)). Overall, those changes seem to make things better, but a few test cases still show those fluctuations. I tried to reproduce them with another MWE, adding some things that we do in our actual experiments:

  • This is a different test problem.
  • We put the operands into a tuple and use tuple unpacking when calling the function. We do this because it makes the implementation of our experiments simpler.
  • We call GC.gc() before disabling garbage collection. This seems to make the results more stable (if they are stable at all).

The code looks like this:

using DelimitedFiles
using LinearAlgebra
BLAS.set_num_threads(1)

function fun(M109::Array{Float64,2}, M110::Diagonal{Float64,Array{Float64,1}}, M111::Array{Float64,2}, M112::Array{Float64,2}, v113::Array{Float64,1})
  # Collect garbage now, then disable the GC so that it cannot run inside the timed region.
  GC.gc()
  GC.enable(false)
  start = time_ns()
  out = transpose(M109)*M110*(M111+transpose(M112))*v113
  finish = time_ns()
  GC.enable(true)
  # Return the result together with the elapsed time in seconds.
  return (tuple(out), (finish-start)*1e-9)
end

function main()

  M109 = rand(1150,1300)
  M110 = Diagonal(rand(1150))
  M111 = rand(1150,150)
  M112 = rand(150,1150)
  v113 = rand(150)
  matrices = (M109, M110, M111, M112, v113)

  iters = 500
  timings = Array{Float64}(undef, iters)
  for i = 1:iters
    out, elapsed = fun(matrices...)
    timings[i] = elapsed
  end
  file = open("timings.txt", "w")
  writedlm(file, timings)
  close(file)
end

main()

Now there is some element of randomness: sometimes the timings are very stable, and sometimes the fluctuations show up again.


Notice that, again, when the timings are stable, the code runs relatively “slow”, except for a few fast iterations at the beginning. The slow cases are about 1.4x slower than the fast cases. What could cause this behavior?
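To put a number on the two regimes instead of reading them off a plot, one could count how many iterations land well above the fastest observed time. This is only a minimal sketch and not part of our experiments: the helper name `slow_fraction` and the default threshold of 1.2 are assumptions picked for illustration, and the input below is synthetic data that mimics a 1.4x gap.

```julia
# Hypothetical helper: fraction of iterations that are slower than
# `threshold` times the fastest iteration. The threshold is an arbitrary choice.
function slow_fraction(timings::AbstractVector{<:Real}; threshold::Real = 1.2)
  fastest = minimum(timings)
  return count(t -> t > threshold * fastest, timings) / length(timings)
end

# Synthetic example: 3 "fast" iterations followed by 7 that are 1.4x slower.
example = vcat(fill(1.0, 3), fill(1.4, 7))
println(slow_fraction(example))
```

For the MWE above, the same function could be applied to the contents of timings.txt, for example via `vec(readdlm("timings.txt"))`. A value close to 0 or close to 1 would indicate a stable run, and anything in between a run that switches between the two regimes.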

I agree in the sense that if this is how Julia behaves, this is what we should report. However, we run a lot of tests and also compute speedups. It would be useful to have a single number to summarize those results, but I find it troublesome to summarize such results with a single number (no matter whether it’s the minimum, average, or median).
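To illustrate the problem with a single number, here is a small sketch that reports a few order statistics per test case instead. It rests on assumptions: `summarize` is a hypothetical helper, the choice of quantiles is arbitrary, and the data is synthetic (half the iterations fast, half 1.4x slower) rather than measured.

```julia
using Statistics

# Hypothetical helper: several summary statistics for one test case.
function summarize(timings::AbstractVector{<:Real})
  return (fastest = minimum(timings),
          q25 = quantile(timings, 0.25),
          med = median(timings),
          q75 = quantile(timings, 0.75),
          slowest = maximum(timings))
end

# Synthetic bimodal timings: 5 fast iterations and 5 that are 1.4x slower.
example = vcat(fill(1.0, 5), fill(1.4, 5))
println(summarize(example))
```

With data like this, the median falls between the two groups and matches no iteration that was actually observed, while the minimum ignores the slow regime entirely. Reporting the quartiles next to either of them at least shows that the distribution has two levels.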