Speeding up Julia on aarch64

To make this easier to diagnose, I made a package that runs a small, fixed set of benchmarks for comparing performance across systems.

I added a baked-in reference result from my 2018 i7 MacBook Pro to compare against. A Linux x86_64 machine would probably be a better reference, so I plan to switch it out.

Here it is run on the Xavier NX aarch64 system and compared against the macOS reference system.

julia> using SystemBenchmark
julia> compareToRef(sysbenchmark())
13×5 DataFrames.DataFrame
│ Row │ cat     │ testname        │ ref_ms      │ res_ms      │ factor   │
│     │ String  │ String          │ Float64     │ Float64     │ Float64  │
├─────┼─────────┼─────────────────┼─────────────┼─────────────┼──────────┤
│ 1   │ cpu     │ FloatMul        │ 1.61e-6     │ 6.08e-7     │ 0.37764  │
│ 2   │ cpu     │ FloatSin        │ 5.681e-6    │ 8.68342e-6  │ 1.5285   │
│ 3   │ cpu     │ VecMulBroad     │ 4.72799e-5  │ 5.15783e-5  │ 1.09091  │
│ 4   │ cpu     │ MatMul          │ 0.000379541 │ 0.00091201  │ 2.40293  │
│ 5   │ cpu     │ MatMulBroad     │ 0.000165929 │ 0.000199591 │ 1.20287  │
│ 6   │ cpu     │ 3DMulBroad      │ 0.00184215  │ 0.0018017   │ 0.978042 │
│ 7   │ cpu     │ FFMPEGH264Write │ 230.533     │ 616.325     │ 2.67348  │
│ 8   │ mem     │ DeepCopy        │ 0.000207828 │ 0.000339916 │ 1.63556  │
│ 9   │ diskio  │ TempdirWrite    │ 0.196437    │ 0.070913    │ 0.360997 │
│ 10  │ diskio  │ TempdirRead     │ 0.0691485   │ 0.0176      │ 0.254525 │
│ 11  │ loading │ JuliaLoad       │ 282.547     │ 246.116     │ 0.871063 │
│ 12  │ loading │ UsingCSV        │ 1772.47     │ 3065.72     │ 1.72963  │
│ 13  │ loading │ UsingVideoIO    │ 4002.58     │ 15329.0     │ 3.82978  │
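
For reading the table: the factor column appears to be res_ms divided by ref_ms (row 1 gives 6.08e-7 / 1.61e-6 ≈ 0.378), so a value above 1 means the Xavier NX took longer than the reference. A minimal sketch that recomputes it for a few rows in plain Julia, with the numbers copied from the table. The `slowdown` helper is a hypothetical name for illustration, not part of SystemBenchmark.

```julia
# Values copied from the table above: (testname, ref_ms, res_ms)
rows = [
    ("FloatMul",     1.61e-6,     6.08e-7),
    ("MatMul",       0.000379541, 0.00091201),
    ("UsingVideoIO", 4002.58,     15329.0),
]

# Hypothetical helper: ratio of this system's time to the reference time
slowdown(ref_ms, res_ms) = res_ms / ref_ms

for (name, ref_ms, res_ms) in rows
    f = slowdown(ref_ms, res_ms)
    verdict = f > 1 ? "slower than reference" : "faster than reference"
    println(rpad(name, 14), round(f, digits = 3), "x  (", verdict, ")")
end
```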

Observations:

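  • Generally not too bad: most CPU tests are within about 1.5x of the reference
  • Surprisingly, diskio is faster (roughly 3x to 4x)
  • Matrix multiplication is slow (MatMul takes about 2.4x as long)
  • FFMPEG is slow to encode (about 2.7x as long)
  • Loading VideoIO is slow (about 3.8x as long; perhaps an artifact of a precompilation or invalidation issue?)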
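
One way to start digging into the matrix multiplication result is to check how many threads BLAS is using and to time a multiply directly. This is only a rough sketch: the 100×100 size and the repeat count are arbitrary choices of mine, not what the MatMul test in SystemBenchmark does, and `BLAS.get_num_threads` needs Julia 1.6 or later.

```julia
using LinearAlgebra

# Number of threads the BLAS library is currently using (Julia 1.6+)
nthreads = BLAS.get_num_threads()

# Arbitrary problem size, chosen only for illustration
A = rand(100, 100)
B = rand(100, 100)

A * B  # warm-up run so compilation time is not measured

# Best of 100 runs, in seconds
t = minimum(@elapsed(A * B) for _ in 1:100)

println("BLAS threads: ", nthreads, ", best matmul time: ", t * 1e3, " ms")
```

Running the same snippet on both machines gives a number that is independent of the benchmark package, which helps separate a BLAS issue from a package issue.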

As for SystemBenchmark.jl, I’d definitely welcome suggestions or PRs on how to improve the tests, or on informative tests to add. It would be great to arrive at a fixed set of tests that can be relied on.
I also want to add some external non-Julia benchmarks, but couldn’t find a nice cross-platform benchmarking library that could be JLL-ed.

To be reliable in the long run, this package should also lock down version numbers and capture system data.
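
For the system data part, something along these lines might be enough to start with. It is a sketch using only what Base provides through `VERSION` and the `Sys` module; the `sysinfo` name and the choice of fields are my assumptions, not an existing SystemBenchmark feature.

```julia
# Collect basic facts about the host using only Base
sysinfo = (
    julia_version = string(VERSION),
    os            = string(Sys.KERNEL),
    arch          = string(Sys.ARCH),
    machine       = Sys.MACHINE,
    cpu_name      = Sys.CPU_NAME,
    cpu_threads   = Sys.CPU_THREADS,
    total_mem_GiB = round(Sys.total_memory() / 2^30, digits = 2),
)

# Print one field per line so it can be pasted next to benchmark results
for (k, v) in pairs(sysinfo)
    println(rpad(string(k), 14), " = ", v)
end
```

Storing a record like this alongside each result would make it possible to tell later whether a change in timings came from the hardware, the OS, or the Julia version.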
