I know that in MATLAB the default order of operations is left to right, so it can be very useful to use parenthesis to specify the optimal order of matrix multiplications. I was surprised to see that Julia seems to figure this out automatically, at least for this simple example.

using BenchmarkTools
A1 = rand(10, 1000) # for multiplying from the left
A2 = rand(1000, 10) # for multiplying from the right
B = rand(1000, 1000)
C = rand(1000, 1000)
# Small matrix on the left
julia> @btime A1*B*C;
1.851 ms (4 allocations: 156.34 KiB)
julia> @btime (A1*B)*C;
1.833 ms (4 allocations: 156.34 KiB)
julia> @btime A1*(B*C);
11.021 ms (4 allocations: 7.71 MiB)
# Small matrix on the right
julia> @btime B*C*A2;
891.395 μs (4 allocations: 156.34 KiB)
julia> @btime B*(C*A2);
894.580 μs (4 allocations: 156.34 KiB)
julia> @btime (B*C)*A2;
10.445 ms (4 allocations: 7.71 MiB)

Is there some documentation I can read about how this is done? And does Julia also find the optimal order for longer chains of matrix multiplications?
There is a related thread from 2018, which doesn’t quite answer this (and perhaps things have changed since then?).

I can’t answer your question. But the good thing about julia is that it is mostly written in julia.
So entering

@edit A1*B*C

leads to matmul.jl.
There you can find the optimisations for multiplying more than one matrix around line 1100.
In this case it is a bit harder to figure out what is going one, because (as in most languages) matrix multiplication is done via BLAS.

For your other question is it sufficient to enter

?*

to see that this optimization is used for 3 and 4 matrizes and requires at least Julia 1.7.

you can read documentation about linear algebra. This write

Linear algebra functions in Julia are largely implemented by calling functions from LAPACK. Sparse matrix factorizations call functions from SuiteSparse. Other sparse solvers are available as Julia packages.

A key observation is that Julia parses x * y * z as *(x, y, z) and all that’s needed to implement smart multiplication ordering is to write a method for *(x::AbstractMatrix, y::AbstractMatrix, z::AbstractMatrix) which checks the sizes and chooses multiplication order based on that.

You have two possible orderings for 2, then 5 for 4 matrices, and a combinatorial explosion of orderings for more, so I thought this package was very cool (but will not install in 1.0+):

The default in 1.7 should very often be good enough, and you would want to opt into opmtimization for longer chains with a package, since doing the optimization is costly (so resurrecting that package would be valuable, while no longer, as needed).

You wouldn’t always want left-to-right ordering (but I thought it would be a breaking change, requiring 2.0, as with) neither for:

As Rudi advised the relevant source code can be found using @edit (It is the file (…)/stdlib/v1.7/LinearAlgebra/src/matmul.jl).

There are functions _tri_matmul and _quad_matmul for the case of 3 and 4 matrix multiplications respectively. These functions evaluate the cost of each possible ordering and choose the best one before BLAS comes into play. It looks like there is no logic for evaluating the ordering for chains of 5 or more matrices. Probably because the number of possible orderings grows expontially like Palli mentioned.

Here is the code for the case of multiplying three matrices (~line 1120-1134 in matmul.jl):

*(A::AbstractMatrix, B::AbstractMatrix, C::AbstractMatrix) = _tri_matmul(A,B,C)
function _tri_matmul(A,B,C,δ=nothing)
n,m = size(A)
# m,k == size(B)
k,l = size(C)
costAB_C = n*m*k + n*k*l # multiplications, allocations n*k + n*l
costA_BC = m*k*l + n*m*l # m*l + n*l
if costA_BC < costAB_C
isnothing(δ) ? A * (B*C) : A * mat_mat_scalar(B,C,δ)
else
isnothing(δ) ? (A*B) * C : mat_mat_scalar(A*B, C, δ)
end
end