Numpy 10x faster than Julia ?! What am I doing wrong ?! [solved - julia faster now]

We’re still a factor of 1000 away from the numpy version, which can hardly be explained by the difference in BLAS :) Note that the only BLAS call in my Julia version should be the `mul!` for the matrix multiply, and since the matrices are tiny, there’s not going to be much of a difference between OpenBLAS and MKL.
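To check how much time the `mul!` call itself could account for, a rough timing sketch like the one below might help. The helper name `time_small_mul`, the 4×4 size and the repetition count are placeholders I made up for illustration, not values from the code discussed in this thread.

```julia
using LinearAlgebra

# Hypothetical helper: average time of one in-place multiply of two n×n matrices.
function time_small_mul(n, reps)
    A = rand(n, n)
    B = rand(n, n)
    C = zeros(n, n)
    mul!(C, A, B)              # warm-up call so compilation is not timed
    t = @elapsed for _ in 1:reps
        mul!(C, A, B)          # writes A*B into C without allocating a new matrix
    end
    # Return seconds per call and a sanity check against the allocating product.
    return t / reps, C ≈ A * B
end

per_call, ok = time_small_mul(4, 100_000)   # assumed size and repetition count
println("seconds per mul! call: ", per_call, " (result correct: ", ok, ")")
```

If the per-call time comes out far smaller than the total runtime per iteration of the real code, the slowdown is probably somewhere other than the matrix multiply.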