Matrix multiplication and vector multiplication give different results!

I found that matrix-matrix and matrix-vector multiplication give (slightly) different results!

julia> m1 = randn(200, 300);

julia> m2 = randn(300, 400);

julia> m12 = m1 * m2;        # matrix-matrix multiplication

julia> m12col400 = m1 * m2[:, 400];        # matrix-vector multiplication

julia> m12[:, 400] == m12col400
false

julia> m12[:, 400] .- m12col400
200-element Vector{Float64}:
  0.0
  0.0
 -1.4210854715202004e-14
  1.7416623698807143e-15
 -7.521760991835436e-15
  1.2434497875801753e-14
  4.440892098500626e-15
  0.0
  2.6645352591003757e-15
  5.329070518200751e-15
  3.552713678800501e-15
  1.7763568394002505e-15
  3.552713678800501e-15
  ⋮
  3.552713678800501e-15
 -7.105427357601002e-15
  1.3322676295501878e-15
 -3.552713678800501e-15
  3.1086244689504383e-15
 -1.7763568394002505e-15
 -3.552713678800501e-15
 -3.552713678800501e-15
 -1.7763568394002505e-15
 -3.552713678800501e-15
  1.0658141036401503e-14
  1.2434497875801753e-14
  0.0

It’s not a problem of floating-point resolution. It’s a matter of different implementations of matrix-matrix vs. matrix-vector multiplication!

Yes, this is covered in that PSA a bit further down:

With lots of stuff in hardware, doing things like vectorization or reassociation (changing the order of operations) will change results.

julia> v = rand(10000);

julia> sum(v)
5004.669545612722

julia> let s = 0.0
           for x in v
               s += x
           end
           s
       end
5004.669545612701
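
(The difference comes from association order: Julia’s sum uses pairwise summation, while the loop accumulates strictly left to right. foldl uses the same left-to-right order as the loop, so it should reproduce the loop’s result:)

julia> foldl(+, v)        # same left-to-right order as the loop above
5004.669545612701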

Here’s an easy way to see why this happens:

julia> (1 + 1e20) - 1e20
0.0

julia> 1 + (1e20 - 1e20)
1.0

Matrix multiplication does all sorts of things that can change the order of operations, or even use fused multiply-add (fma) operations with different rounding behavior, etc. This will naturally cause discrepancies like the ones you saw.
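
For example, an fma rounds once where a separate multiply and add rounds twice, so the two can disagree in the last bits (a small self-contained illustration):

julia> a = 1 + 2.0^-52; b = 1 - 2.0^-52; c = -1.0;

julia> a * b + c        # a*b rounds to 1.0 before the addition
0.0

julia> fma(a, b, c)     # single rounding of the exact a*b + c
-4.930380657631324e-32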

This is why people are advised to use isapprox() instead of exact equality:

julia> m12[:, 400] == m12col400
false

julia> m12[:, 400] ≈ m12col400
true

These two results are equal up to the resolution guaranteed by the algorithms we use.
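
If you need to control how close is close enough, isapprox also accepts explicit tolerances; for Float64 arguments the default relative tolerance is sqrt(eps(Float64)), so spelling it out gives the same answer:

julia> isapprox(m12[:, 400], m12col400; rtol = sqrt(eps(Float64)))        # same as the default rtol
true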

I don’t see any surprise. One is a BLAS level-2 operation (gemv), the other a BLAS level-3 operation (gemm); they normally have different implementations in BLAS libraries.
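
If you want to see the two code paths side by side, the BLAS wrappers in LinearAlgebra can be called directly (a quick sketch, reusing m1 and m2 from above):

julia> using LinearAlgebra

julia> BLAS.gemv('N', m1, m2[:, 400]) ≈ BLAS.gemm('N', 'N', m1, m2)[:, 400]        # level-2 vs. level-3 path
true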