Speed comparison matrix multiplication in Julia

The point is that the comparison wasn’t apples to apples until the axis-orientation issue was fixed. Now that it is fixed, it is. This isn’t just an optimization detail; it is fundamental to comparing the languages fairly.

Great! :smiley:


WAT. do we know why?


This convention for ordering arrays (column-major) is common in many languages, such as Fortran, MATLAB, and R, to name a few. The alternative is row-major ordering, which is the convention adopted by C and Python (NumPy), among other languages.
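
To make the consequence concrete, here is a minimal sketch (the function names are made up for illustration). In Julia the first index varies fastest in memory, so the inner loop should run over rows. Both functions return the same sum; only the memory access pattern differs, and on large matrices the column-order version is typically the faster one.

```julia
# Column-major layout: elements of one column sit next to each other in memory.
A = reshape(1:6, 2, 3)   # 2×3 matrix holding 1..6
flat = vec(A)            # memory order: column 1, then column 2, then column 3

# Hypothetical helper: inner loop walks down a column (contiguous in Julia).
function sum_column_order(A::AbstractMatrix)
    total = zero(eltype(A))
    for j in axes(A, 2)        # outer loop over columns
        for i in axes(A, 1)    # inner loop over rows: stride-1 access
            total += A[i, j]
        end
    end
    return total
end

# Hypothetical helper: inner loop walks along a row (strided access in Julia).
function sum_row_order(A::AbstractMatrix)
    total = zero(eltype(A))
    for i in axes(A, 1)        # outer loop over rows
        for j in axes(A, 2)    # inner loop jumps size(A, 1) elements per step
            total += A[i, j]
        end
    end
    return total
end
```

A loop nest ported line by line from C or NumPy ends up in the second form, which is why the axis orientation has to be swapped before timing anything.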

Well, now I know. This should be at the top of the page in bold :smiley:

I don’t know the details of it, but Tullio does not require LoopVectorization, and does a lot of nice performance stuff without it. If you let Tullio use LoopVectorization on top of that, you can get even better performance.
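
For reference, a minimal sketch of matrix multiplication with Tullio, assuming Tullio.jl is installed (the matrix sizes are arbitrary). As far as I understand the README, loading LoopVectorization before the macro is used is what lets Tullio build on it, so treat the commented-out line as an assumption rather than a guarantee.

```julia
using Tullio
# using LoopVectorization   # optional; assumption: must be loaded before @tullio is used

A = rand(100, 100)
B = rand(100, 100)

# := creates a new array C; the index k appears only on the right, so it is summed over.
@tullio C[i, j] := A[i, k] * B[k, j]

# Sanity check against the built-in (BLAS-backed) product.
matches = C ≈ A * B
```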


The “Fast & Slow” section of the Tullio.jl README (GitHub: mcabbott/Tullio.jl) has some details.


Well, the thing is, almost everything in this section should be at the top of the page in bold :grinning_face_with_smiling_eyes: