To make this easier to diagnose, I made a package that runs a small, fixed set of benchmarks for comparing performance across systems.
It ships with a baked-in reference result from my 2018 i7 MacBook Pro to compare against. A Linux x86_64 machine would probably be a better reference, so I plan to switch it out.
Here it is run on the Xavier NX (aarch64) and compared against the macOS reference. The factor column is res_ms / ref_ms, so values above 1 mean the Xavier NX is slower than the reference.
julia> using SystemBenchmark
julia> compareToRef(sysbenchmark())
13×5 DataFrames.DataFrame
│ Row │ cat     │ testname        │ ref_ms      │ res_ms      │ factor   │
│     │ String  │ String          │ Float64     │ Float64     │ Float64  │
├─────┼─────────┼─────────────────┼─────────────┼─────────────┼──────────┤
│ 1   │ cpu     │ FloatMul        │ 1.61e-6     │ 6.08e-7     │ 0.37764  │
│ 2   │ cpu     │ FloatSin        │ 5.681e-6    │ 8.68342e-6  │ 1.5285   │
│ 3   │ cpu     │ VecMulBroad     │ 4.72799e-5  │ 5.15783e-5  │ 1.09091  │
│ 4   │ cpu     │ MatMul          │ 0.000379541 │ 0.00091201  │ 2.40293  │
│ 5   │ cpu     │ MatMulBroad     │ 0.000165929 │ 0.000199591 │ 1.20287  │
│ 6   │ cpu     │ 3DMulBroad      │ 0.00184215  │ 0.0018017   │ 0.978042 │
│ 7   │ cpu     │ FFMPEGH264Write │ 230.533     │ 616.325     │ 2.67348  │
│ 8   │ mem     │ DeepCopy        │ 0.000207828 │ 0.000339916 │ 1.63556  │
│ 9   │ diskio  │ TempdirWrite    │ 0.196437    │ 0.070913    │ 0.360997 │
│ 10  │ diskio  │ TempdirRead     │ 0.0691485   │ 0.0176      │ 0.254525 │
│ 11  │ loading │ JuliaLoad       │ 282.547     │ 246.116     │ 0.871063 │
│ 12  │ loading │ UsingCSV        │ 1772.47     │ 3065.72     │ 1.72963  │
│ 13  │ loading │ UsingVideoIO    │ 4002.58     │ 15329.0     │ 3.82978  │
Observations:
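- Generally not too bad: most of the CPU micro-benchmarks are within roughly 1.5x of the reference
- Surprisingly faster on diskio (about 3-4x faster than the reference)
- Matrix multiplication is slow (about 2.4x)
- FFMPEG is slow to encode (about 2.7x)
- Loading VideoIO is slow (about 3.8x; perhaps an artifact of a precompile invalidation issue?)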
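For anyone who wants to check the factor column or pick out the slow tests by hand, here is a minimal sketch in plain Julia. It does not use SystemBenchmark.jl at all: the `slowdown` helper and the 2x threshold are my own illustrative assumptions, and the rows are just a few values copied from the table above.

```julia
# Hypothetical helper, not part of SystemBenchmark.jl.
# factor = res_ms / ref_ms, so a value above 1 means slower than the reference.
slowdown(ref_ms, res_ms) = res_ms / ref_ms

# A few (testname, ref_ms, res_ms) rows copied from the table above.
rows = [
    (testname = "FloatMul",     ref_ms = 1.61e-6,     res_ms = 6.08e-7),
    (testname = "MatMul",       ref_ms = 0.000379541, res_ms = 0.00091201),
    (testname = "TempdirRead",  ref_ms = 0.0691485,   res_ms = 0.0176),
    (testname = "UsingVideoIO", ref_ms = 4002.58,     res_ms = 15329.0),
]

# Arbitrary threshold chosen for illustration: flag tests more than 2x slower.
threshold = 2.0
flagged = [r.testname for r in rows if slowdown(r.ref_ms, r.res_ms) > threshold]

# Print each test with its factor and mark the ones above the threshold.
for r in rows
    f = slowdown(r.ref_ms, r.res_ms)
    println(rpad(r.testname, 14), round(f; digits = 3), f > threshold ? "  <- slow" : "")
end
```

With these four rows, only MatMul and UsingVideoIO end up in `flagged`.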
As for SystemBenchmark.jl, I’d definitely welcome suggestions/PRs on how to improve the tests, or on informative tests to add. It would be great to arrive at a fixed set of tests that can be relied on.
I also want to add some external non-Julia benchmarks, but couldn’t find a nice cross-platform benchmarking lib that could be JLL-ed.
To be reliable in the long run, this package should also lock down version numbers and capture system data.
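As a rough idea of what capturing system data could look like, here is a sketch that uses only what Base provides. The `system_snapshot` function and its field names are hypothetical, not something the package currently does, and the selection of fields is just a guess at what would be useful to store next to each result.

```julia
# Hypothetical helper (name and fields are illustrative, not part of SystemBenchmark.jl):
# collect a small snapshot of the host so results can be interpreted later.
function system_snapshot()
    cpus = Sys.cpu_info()  # one entry per logical CPU; may be empty on some platforms
    return (
        julia_version = string(VERSION),
        os            = string(Sys.KERNEL),
        arch          = string(Sys.ARCH),
        machine       = Sys.MACHINE,
        cpu_model     = isempty(cpus) ? "unknown" : cpus[1].model,
        cpu_threads   = Sys.CPU_THREADS,
        total_mem_gib = round(Sys.total_memory() / 2^30; digits = 2),
    )
end

snap = system_snapshot()

# Print the snapshot as simple key/value lines.
for (k, v) in pairs(snap)
    println(rpad(string(k), 14), v)
end
```

For the version side, recording the Julia version as above plus the exact package versions from the benchmark environment's Manifest.toml would probably cover most of it.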