Article - How fast is your programming language

I get 1.95 sec, something like a 24% speedup for the matmul, if I run it in the REPL (and add @simd). Partial timing here:

@time a = matgen(n)
0.005693 seconds (2 allocations: 17.166 MiB)

@time c = matmul(n, a, b);
  1.932954 seconds (2 allocations: 17.166 MiB)

So it’s worth AOT compiling at least that one. Also, the for loops need to be split in two, i.e. the combined two-variable form (`for i in 1:n, j in 1:n`) doesn’t take @simd, but maybe it should?
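To illustrate what I mean by splitting the loops, here is a minimal sketch. It assumes a plain triple-loop multiply; `matmul_simd` and its signature are hypothetical and not the benchmark's actual `matmul(n, a, b)`. As far as I can tell, `@simd` only accepts a single-variable `for`, so the nested loops are written out separately and only the innermost one is annotated:

```julia
# Hypothetical sketch, not the benchmark's matmul: a naive multiply with
# the loops written separately so that @simd can be applied.
function matmul_simd(a::Matrix{Float64}, b::Matrix{Float64})
    m, k = size(a)
    k2, n = size(b)
    k == k2 || throw(DimensionMismatch("inner dimensions must match"))
    c = zeros(m, n)
    # The combined form `for j in 1:n, l in 1:k` is rejected by @simd,
    # so each loop gets its own `for`.
    for j in 1:n
        for l in 1:k
            blj = b[l, j]          # hoist the scalar out of the inner loop
            # Innermost loop walks down a column, which is contiguous in
            # Julia's column-major layout.
            @simd for i in 1:m
                @inbounds c[i, j] += a[i, l] * blj
            end
        end
    end
    return c
end
```

Calling it once before timing, e.g. `matmul_simd(a, b)` and then `@time matmul_simd(a, b)`, keeps compilation out of the measurement.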

And I’m unclear why I get 2 allocations, not one, for each call?