Is matrix multiplication not associative?

Hi! I have a very simple code snippet:

let
	B = [1;1]
	K = [1 2]
	x = [1;2]

	(B*K)*x #Works fine
	B*(K*x) #MethodError: no method matching *(::Vector{Int64}, ::Vector{Int64})
end

It seems that Julia automatically converts the matrix to a vector for the second case. Is there a way to stop this behavior? I want to define a variable u = K*x, but this breaks my code.

Julia version: 1.11.4.

The issue is that [1 2] is a matrix, not a row vector, so [1 2] * [1; 2] returns a vector of length one, not a scalar. To get a row vector, type [1; 2]' or transpose([1; 2]) or adjoint([1; 2]).

julia> let
           B = [1; 1]
           K = [1; 2]'
           x = [1; 2]

           @show (B * K) * x # Works fine
           @show B * (K * x) # Works fine
       end;
(B * K) * x = [5, 5]
B * (K * x) = [5, 5]

Wow! Thank you!


The inconsistency is arguably that B * K does not error here: multiplying a column vector on the left of a matrix doesn’t really make sense. But since it’s possible to interpret every column vector as a single-column matrix, there apparently exists a multiplication method that treats B as such and computes the resulting matrix product.

The converse “demotion” of K to a row vector would not make sense because K * x is a regular, valid matrix-vector product returning a vector, and Julia can’t know that you actually meant to think of K as a row vector instead.
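A quick way to see both behaviors, using the arrays from the original post (a minimal check of the types and sizes involved):

```julia
B = [1; 1]   # Vector{Int64}, length 2
K = [1 2]    # 1×2 Matrix{Int64}
x = [1; 2]   # Vector{Int64}, length 2

# B is implicitly treated as a 2×1 matrix here, so B * K is a 2×2 matrix:
@assert size(B * K) == (2, 2)

# K * x is a regular matrix-vector product: a length-1 Vector, not a scalar:
@assert K * x == [5]
@assert K * x isa Vector{Int64}
```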


This is a common pattern in control systems, where it has a slightly different interpretation. B * K is intended as an outer product (a column multiplied by a row) that yields a matrix. People use it a lot and know what it means, so it shouldn’t error.

Rather, I think the issue is that @Yeguarr intends x = [1; 2] to be a 1-column matrix (as in Matlab) and not a vector. In Julia, K*x is a 1-element vector, which is neither a scalar nor a 1x1 matrix; in Matlab those both work and mean the same thing. One way to do it in Julia:

julia> B = [1;1] # vector
julia> K = [1 2] # row
julia> x1col = [1 2]' # ugly but means 1 column matrix not a Julia vector
julia> (B*K)*x1col # works fine
julia> B*(K*x1col) # also works fine
2×1 Matrix{Int64}:
 5
 5

This is one of the rare cases where Matlab does “what you want.” I do believe Julia’s vector approach is otherwise far superior to Matlab overall, but it’s a bit unwieldy for this common controls pattern.

I avoid x = [1; 2] in Julia because it means the same as x = [1, 2]. I mentally want to parse semicolon as row delimiter in a matrix, which it usually is, just not here. Likewise I would still say B = [1, 1] to be clear to myself. Even though (B*K) works fine with B as a vector, I don’t want to fool myself or others into thinking it’s a 1-column matrix.
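To illustrate: the comma and single-semicolon forms build the exact same vector, and neither is a 1-column matrix (a quick check, using reshape to build the true 2×1 matrix for comparison):

```julia
@assert [1; 2] == [1, 2]                  # identical Vectors; ; is not a row delimiter here
@assert [1; 2] isa Vector{Int}
@assert [1; 2] != reshape([1, 2], 2, 1)   # a true 2×1 Matrix compares unequal (sizes differ)
```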

For controls people, Matlab’s “everything is a matrix” approach works out remarkably well. It also has a bunch of unfortunate ambiguities outside of controls.


Thank you for your comment! You are right, I’m coming from Matlab where exactly the same code works fine, never thought it would be an issue here.

I think in my case it would be better to define both B and K as matrices (because they are transformations in my case) and leave x as a vector. So updated code:

B = [1 1]' #wish there was a better way
K = [1 2]
x = [1;2]

(B*K)*x #Works fine
B*(K*x) #Works fine

An outer product maps two vectors to a matrix. It’s equivalent to a matrix product between a single-column matrix and a single-row matrix. But what’s happening in this case is neither of these—B * K multiplies a (column) vector on the left by a matrix on the right. This implies contracting over a dimension that isn’t actually there. I agree that it’s convenient, and it’s likely harmless because it’s unambiguous—how else would you make sense of column vector times matrix? But formally speaking it’s inconsistent—Julia usually enforces the distinction between a vector and a single-column matrix, but not here—and this inconsistency is what breaks associativity in the original example.
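The implicit promotion can be made explicit by reshaping B into the single-column matrix it is effectively treated as (a small check; the missing symmetric "demotion" on the right is also shown):

```julia
B = [1, 1]               # Vector
K = [1 2]                # 1×2 Matrix
Bmat = reshape(B, :, 1)  # the 2×1 Matrix that B is implicitly promoted to

# vector * matrix behaves exactly like the single-column-matrix product:
@assert B * K == Bmat * K

# ...but there is no analogous method on the right, which is what broke
# associativity in the original example: B * (K * x) is vector * vector.
@assert try B * [5]; false catch e; e isa MethodError end
```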

In Matlab, everything is a matrix, including “scalars”, so you can’t run into these subtleties. Convenient in many cases, absurd in others.


Here’s how:

julia> [1; 2;;]
2×1 Matrix{Int64}:
 1
 2

To elaborate on this syntax: in array literals, n consecutive semicolons increment the n-th index. Thus, you can instantiate arrays of arbitrary dimension:

julia> [1; 2;;; 3; 4]
2×1×2 Array{Int64, 3}:
[:, :, 1] =
 1
 2

[:, :, 2] =
 3
 4

In [1; 2;;], the final ;; forces the existence of a second index, obtaining a column matrix.

The Matlab-like matrix literal syntax is just extra sugar: spaces increment the second index, like ;;, and line breaks increment the first index, like ; (with the extra subtlety that if you use spaces instead of ;; you should enter elements by row instead of by column).
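For example, these two literals produce the same matrix, one entered by row with the Matlab-like sugar and one by column with semicolons (a quick check):

```julia
# Matlab-like sugar: spaces separate columns, a single ; separates rows
A = [1 2; 3 4]

# pure-semicolon form: single ; increments the first (row) index,
# ;; increments the second (column) index, so elements are entered by column
Acol = [1; 3;; 2; 4]

@assert A == Acol   # both are the 2×2 matrix [1 2; 3 4]
```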


There’s an argument that we should have a vector * vector method that treats the left vector as a 1-column matrix and works if the right vector has 1 element, throwing an error otherwise. This would be more consistent with other methods treating vectors as 1-column matrices when the shapes work.
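Such a method doesn’t exist in Base, but a sketch of the proposed semantics might look like this (vecmul is a hypothetical name, not a real Julia API; overloading Base.:* on two Base types would be type piracy, so a plain function is shown):

```julia
# Hypothetical vector * vector: treat u as an n×1 matrix, so the product
# only makes sense when v has exactly one element (an n×1 times 1×1 product).
function vecmul(u::AbstractVector, v::AbstractVector)
    length(v) == 1 || throw(DimensionMismatch(
        "second vector must have length 1, got $(length(v))"))
    return u * v[1]   # scale u by the single element of v
end

vecmul([1, 1], [5])   # == [5, 5]; B * (K * x) would associate again
```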


Just for future reference, another practical solution that works with the original arrays as they are is:

let
	B = [1;1]
	K = [1 2]
	x = [1;2]

	@show (B*K)*x 
	@show B*only(K*x) 
end;

It sounds like turning B into a matrix works better conceptually for you in this case, but only is a generally useful function to keep in mind for turning a single-element array into a scalar value.
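A quick illustration of only’s behavior (nothing beyond Base is assumed):

```julia
@assert only([5]) == 5      # 1-element Vector → scalar
@assert only([5;;]) == 5    # works on a 1×1 Matrix too, or any 1-element iterable

# only refuses containers that don't have exactly one element:
@assert try only([1, 2]); false catch e; e isa ArgumentError end
```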


I have a question: does x' always resolve to transpose(x), as long as complex numbers are not involved?

Just for notational clarity: the standard syntax for vector literals is [1, 2], that is, commas instead of semicolons. They end up doing the same thing, but using commas sets them apart visually, which might be beneficial.

x' always corresponds to adjoint(x), but yes: adjoint is (recursive) transposition for real numbers.
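Concretely: for real arrays the two are elementwise equal, while for complex arrays the adjoint also conjugates (a quick check):

```julia
xr = [1.0, 2.0]
@assert xr' == transpose(xr)           # equal for real vectors

zc = [1 + 2im, 3 + 0im]
@assert zc'[1] == 1 - 2im              # adjoint conjugates the entries...
@assert transpose(zc)[1] == 1 + 2im    # ...transpose does not
```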

Maybe I’ll stick to transpose for rigor, although it’s lengthier.
(I don’t work with complex numbers.)

Why do you think transpose is more “rigorous” for real arrays?

(For most applications in linear algebra, I would say that adjoint is generally what you want, for either real or complex arrays. The reason this is an important operation in linear algebra is its relationship to inner products.)


Taken literally, transpose is the operation I want (since complex numbers are not involved in my context). At least the two display differently in the Julia REPL:

julia> transpose(x)
1×3 transpose(::Vector{Float64}) with eltype Float64:
 0.715859  0.0920015  0.183793

julia> x'
1×3 adjoint(::Vector{Float64}) with eltype Float64:
 0.715859  0.0920015  0.183793

It’s not only the inner product; it’s arguably just as important for the outer product.
I may transpose other objects as well (e.g. Array{JuMP.GenericVariableRef{Float64}, 1} in this post).

Couldn’t you just as easily say that the adjoint is the desired operation?

transpose is deemed the more fundamental operation, e.g., in my post linked above.
adjoint comprises transpose and conjugation (the latter is not needed in my case).

I would disagree. In linear algebra, the more fundamental operation is the adjoint ^* (' in Julia), which is defined by the inner product \langle x, y \rangle = x^* y for vectors, and \langle x, A y \rangle = \langle A^* x, y\rangle for any linear operator A.
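Both defining identities are easy to verify numerically (a sketch using LinearAlgebra.dot and random complex arrays):

```julia
using LinearAlgebra

x = randn(ComplexF64, 3)
y = randn(ComplexF64, 3)
A = randn(ComplexF64, 3, 3)

@assert dot(x, y) ≈ x' * y              # ⟨x, y⟩ = x* y
@assert dot(x, A * y) ≈ dot(A' * x, y)  # ⟨x, A y⟩ = ⟨A* x, y⟩
```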

For real vectors with the usual Euclidean inner product, the adjoint coincides with what people call the “transpose”, and this is the main reason why transposition is so common and has a special name. (As opposed to many other rearrangements you could imagine, like flipping a matrix or vector upside down or rotating a matrix 90°, which don’t have a standard name.) Every time you see a transpose in linear algebra, there’s a good chance that (a) you’re working with real numbers, (b) there is an inner product lurking implicitly nearby, and (c) if you had complex numbers you’d probably need the conjugate-transpose instead.

But anyway, this is a philosophical point. You are free to write transpose(x) instead of x' in Julia if you want; the two are functionally and computationally equivalent for real vectors and matrices.

(There are fairly rare exceptions where you transpose complex vectors without conjugation; that’s why it’s still useful for Julia to have a separate transpose function. If you’re not dealing with linear algebra at all and just want to rearrange a container, there is permutedims.)
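One concrete difference: transpose (and adjoint) recurse into the elements, while permutedims only rearranges the container (a small check with a matrix whose elements are themselves matrices):

```julia
a = [1 2; 3 4]
b = [5 6; 7 8]

# a 1×2 matrix whose elements are 2×2 matrices
M = Matrix{Matrix{Int}}(undef, 1, 2)
M[1, 1] = a
M[1, 2] = b

@assert transpose(M)[2, 1] == [5 7; 6 8]    # recursive: the inner block is transposed
@assert permutedims(M)[2, 1] == [5 6; 7 8]  # non-recursive: the block moves as-is
```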
