I’ve been observing a weird behaviour with a piece of code I’ve been working on. The running time, as returned by BenchmarkTools, will suddenly jump from around 60ms to over a minute. Once that happens, the running time stays around a minute and does not return to the previous value. I’ve been wrecking my brain trying to figure out what might be going on. One clue is that, once the running time jump happens, profiling the code using Profile reveals that the vast majority of time appears to be spent in @Base/task.jl. If I profile the exact same code into different REPL sessions, one in which the running time is normal and one in which it is long, it appears that those lines do indeed account for the difference.
I can see how something like that could happen by chance intermittently; a hiccup in the task scheduling mechanism, or something, but as I mentioned, once the running time enters the “long regime”, it tends to stay there.
I initially discovered this when running a lengthy calculation, where this specific code plays an important part, and noticed that running time would fluctuate from around 2-3 minutes to over an hour. This suggests that whatever triggers the change in running time is not intermittent, but that some thing else is going on.
I realize this is all a big vague, and I’m trying to come up with a scenario that can reliably trigger the change, but so far I haven’t been able to nail it down. If anyone has any idea of what might be going on, or any suggestions of how to debug it, I would be grateful.
The piece of code the undergoes the running time weirdness can be found here, and this is how I’ve been testing it:
using StandardBasisVectors, BenchmarkTools, Profile, LinearAlgebra
q,r = qr(randn(75,75))
Q = Matrix{Float64}(q)
@benchmark standardize_basis($Q)
BenchmarkTools.Trial: 77 samples with 1 evaluation.
Range (min … max): 57.890 ms … 67.230 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 65.026 ms ┊ GC (median): 0.00%
Time (mean ± σ): 64.961 ms ± 1.300 ms ┊ GC (mean ± σ): 0.05% ± 0.43%
▁▃▁▁ ▆▁█▃▆██▃ ▁▃
▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▇████▄▄████████▇▄██▄▄▇▄▄▇ ▁
57.9 ms Histogram: frequency by time 67.2 ms <
Memory estimate: 804.23 KiB, allocs estimate: 8642.
# after running time change
@benchmark StandardBasisVectors.standardize_basis($Q)
BenchmarkTools.Trial: 5 samples with 1 evaluation.
Range (min … max): 1.066 s … 1.101 s ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.072 s ┊ GC (median): 0.00%
Time (mean ± σ): 1.077 s ± 13.807 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █ █ █ █
█▁▁▁▁█▁▁▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
1.07 s Histogram: frequency by time 1.1 s <
Memory estimate: 804.23 KiB, allocs estimate: 8642.
The result of running @profile StandardBasisVectors.standardize_basis(Q)
can be seen here
I’m sure there is something obvious wrong here that I just can’t see.
Edit: Julia version info
versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (x86_64-apple-darwin22.4.0)
CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 on 12 virtual cores