My link had a typo; an admin fixed the typo, but that broke the link.
I had bad C++ compiler flags; with those fixed, the C++ version now runs in about 1 minute, not 9.
So my single-threaded Julia, at 2m30s, is not as good as I thought.
But the code is well suited to multi-threading:
10 threads take 27s real, 20 threads take 17s real, and 40 threads (my max) take 15s real, so clearly it is not a linear speedup.
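To put rough numbers on that, here is a small sketch that computes speedup and parallel efficiency from the timings above, taking the 2m30s single-threaded run as the baseline. The names `t1`, `timings`, `speedup` and `efficiency` are just mine for illustration, not from the actual code.

```julia
# Wall-clock ("real") timings quoted above, in seconds.
t1 = 150.0                                      # single-threaded run: 2m30s
timings = [(10, 27.0), (20, 17.0), (40, 15.0)]  # (threads, seconds)

# Speedup relative to the single-threaded run.
speedup(t1, tn) = t1 / tn

# Parallel efficiency: fraction of ideal linear speedup achieved with n threads.
efficiency(t1, tn, n) = speedup(t1, tn) / n

for (n, tn) in timings
    s = round(speedup(t1, tn); digits=1)
    e = round(100 * efficiency(t1, tn, n); digits=0)
    println("$n threads: speedup $(s)x, efficiency $(e)%")
end
```

That works out to roughly 5.6x on 10 threads, 8.8x on 20 and 10x on 40, so the efficiency drops from about 56% to 25% as threads are added.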
julia> versioninfo()
Julia Version 1.7.0
Commit 3bf9d17731 (2021-11-30 12:12 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, haswell)
Environment:
  JULIA_CPU_THREADS = 40
  JULIA_NUM_THREADS = 40
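The machine has two physical CPUs. As far as I know the E5-2660 v3 is a 10-core, 20-thread part, so that would be 20 physical cores and 40 hardware threads, which fits with 40 threads gaining little over 20.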