Inaccurate Matrix Multiplication

I’m not sure that’s a reliable test. That is, if you called BLAS on the Julia side, would you get exactly the same code (or very similar), and would the differences then not show up?

If the deviation (3 ulps) is actually considered too large, could it be a software bug instead? I doubt an erratum is the cause, though that is also possible. I believe each basic operation should be accurate to within 1 ulp (IEEE 754 requires correct rounding, so 0.5 ulp in round-to-nearest), but the errors can add up, even over the few operations there (though they clearly need not).
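
To make the "errors add up" point concrete, here is a minimal Python sketch (not the code from this thread) that measures how far a plain left-to-right dot product lands from the exact result, in ulps. The function names and the input vectors are made up for illustration, and `math.ulp` assumes Python 3.9 or newer.

```python
import math
from fractions import Fraction

def naive_dot(xs, ys):
    """Left-to-right dot product; every multiply and add rounds once."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

def exact_dot(xs, ys):
    """Exact rational dot product of the same floating-point inputs."""
    return sum((Fraction(x) * Fraction(y) for x, y in zip(xs, ys)), Fraction(0))

def error_in_ulps(computed, exact):
    """Distance from the exact value, measured in ulps of the correctly rounded result."""
    reference = float(exact)  # Fraction -> float is correctly rounded in CPython
    return float(abs(Fraction(computed) - exact) / Fraction(math.ulp(reference)))

# Hypothetical inputs, chosen only because they are not exactly representable.
xs = [0.1, 0.2, 0.3]
ys = [0.3, 0.2, 0.1]

single = error_in_ulps(0.1 * 0.3, Fraction(0.1) * Fraction(0.3))
total = error_in_ulps(naive_dot(xs, ys), exact_dot(xs, ys))
print(f"one multiply: {single:.3f} ulp, three-term dot product: {total:.3f} ulp")
```

A single multiply stays within 0.5 ulp, while the accumulated result can drift further depending on the inputs and the summation order. So a different order (as a BLAS kernel might use) can give a slightly different answer without anything being broken.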