Numpy incorrectly calls a bunch of things “dot”, in Julia, matrix multiplication is just *:
x' * y #or transpose(x) * y
notice you’re unlikely to get speed up for this because everyone is just calling OpenBLAS routine anyway.
Also checkout the very fast einsum pkg Tullio.jl (still, not gonna be faster for dense CPU matrix, OpenBLAS is super optimized, partly by hand written architecture specific assembly)