Is matrix multiplication not associative?

Hi! I have a very simple code snippet:

let
	B = [1;1]
	K = [1 2]
	x = [1;2]

	(B*K)*x #Works fine
	B*(K*x) #MethodError: no method matching *(::Vector{Int64}, ::Vector{Int64})
end

It seems that Julia automatically converts the matrix to a vector for the second case. Is there a way to stop this behavior? I want to define a variable u = K*x, but this breaks my code.

Julia version: 1.11.4.

The issue is that [1 2] is a matrix, not a row vector, so [1 2] * [1; 2] returns a vector of length one, not a scalar. To get a row vector, type [1; 2]' or transpose([1; 2]) or adjoint([1; 2]).

julia> let
           B = [1; 1]
           K = [1; 2]'
           x = [1; 2]

           @show (B * K) * x # Works fine
           @show B * (K * x) # Works fine
       end;
(B * K) * x = [5, 5]
B * (K * x) = [5, 5]

Wow! Thank you!


The inconsistency is arguably that B * K does not error here: multiplying a column vector on the left of a matrix doesn’t really make sense. But since it’s possible to interpret every column vector as a single-column matrix, there apparently exists a multiplication method that treats B as such and computes the resulting matrix product.

The converse “demotion” of K to a row vector would not make sense because K * x is a regular, valid matrix-vector product returning a vector, and Julia can’t know that you actually meant to think of K as a row vector instead.
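A quick way to see both behaviors, using the arrays from the original post (a minimal check of the types and sizes involved):

```julia
B = [1; 1]   # Vector{Int64}, length 2
K = [1 2]    # 1×2 Matrix{Int64}
x = [1; 2]   # Vector{Int64}, length 2

# B is implicitly treated as a 2×1 matrix here, so B * K is a 2×2 matrix:
@assert size(B * K) == (2, 2)

# K * x is a regular matrix-vector product: a length-1 Vector, not a scalar:
@assert K * x == [5]
@assert K * x isa Vector{Int64}
```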


This is a common pattern in control systems, where it has a slightly different interpretation. B * K is intended as an outer product (a column multiplied by a row) that yields a matrix. People use it a lot and know what it means, so it shouldn’t error.

Rather, I think the issue is that @Yeguarr intends x = [1; 2] to be a 1-column matrix (as in Matlab) and not a vector. In Julia, K*x is a 1-element vector, which is neither a scalar nor a 1x1 matrix; in Matlab those both work and mean the same thing. One way to do it in Julia:

julia> B = [1;1] # vector
julia> K = [1 2] # row
julia> x1col = [1 2]' # ugly but means 1 column matrix not a Julia vector
julia> (B*K)*x1col # works fine
julia> B*(K*x1col) # also works fine
2×1 Matrix{Int64}:
 5
 5

This is one of the rare cases where Matlab does “what you want.” I do believe Julia’s vector approach is otherwise far superior to Matlab overall, but it’s a bit unwieldy for this common controls pattern.

I avoid x = [1; 2] in Julia because it means the same as x = [1, 2]. I mentally want to parse semicolon as row delimiter in a matrix, which it usually is, just not here. Likewise I would still say B = [1, 1] to be clear to myself. Even though (B*K) works fine with B as a vector, I don’t want to fool myself or others into thinking it’s a 1-column matrix.
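To illustrate: the comma and single-semicolon forms build the exact same vector, and neither is a 1-column matrix (a quick check, using reshape to build the true 2×1 matrix for comparison):

```julia
@assert [1; 2] == [1, 2]                  # identical Vectors; ; is not a row delimiter here
@assert [1; 2] isa Vector{Int}
@assert [1; 2] != reshape([1, 2], 2, 1)   # a true 2×1 Matrix compares unequal (sizes differ)
```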

For controls people, Matlab’s “everything is a matrix” approach works out remarkably well. It also has a bunch of unfortunate ambiguities outside of controls.


Thank you for your comment! You are right, I’m coming from Matlab where exactly the same code works fine, never thought it would be an issue here.

I think in my case it would be better to define both B and K as matrices (because they are transformations in my case) and leave x as a vector. So updated code:

B = [1 1]' #wish there was a better way
K = [1 2]
x = [1;2]

(B*K)*x #Works fine
B*(K*x) #Works fine

An outer product maps two vectors to a matrix. It’s equivalent to a matrix product between a single-column matrix and a single-row matrix. But what’s happening in this case is neither of these—B * K multiplies a (column) vector on the left by a matrix on the right. This implies contracting over a dimension that isn’t actually there. I agree that it’s convenient, and it’s likely harmless because it’s unambiguous—how else would you make sense of column vector times matrix? But formally speaking it’s inconsistent—Julia usually enforces the distinction between a vector and a single-column matrix, but not here—and this inconsistency is what breaks associativity in the original example.
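The implicit promotion can be made explicit by reshaping B into the single-column matrix it is effectively treated as (a small check; the missing symmetric "demotion" on the right is also shown):

```julia
B = [1, 1]               # Vector
K = [1 2]                # 1×2 Matrix
Bmat = reshape(B, :, 1)  # the 2×1 Matrix that B is implicitly promoted to

# vector * matrix behaves exactly like the single-column-matrix product:
@assert B * K == Bmat * K

# ...but there is no analogous method on the right, which is what broke
# associativity in the original example: B * (K * x) is vector * vector.
@assert try B * [5]; false catch e; e isa MethodError end
```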

In Matlab, everything is a matrix, including “scalars”, so you can’t run into these subtleties. Convenient in many cases, absurd in others.


Here’s how:

julia> [1; 2;;]
2×1 Matrix{Int64}:
 1
 2

To elaborate on this syntax: in array literals, n consecutive semicolons increment the n-th index. Thus, you can instantiate arrays of arbitrary dimension:

julia> [1; 2;;; 3; 4]
2×1×2 Array{Int64, 3}:
[:, :, 1] =
 1
 2

[:, :, 2] =
 3
 4

In [1; 2;;], the final ;; forces the existence of a second index, obtaining a column matrix.

The Matlab-like matrix literal syntax is just extra sugar: spaces increment the second index, like ;;, and line breaks increment the first index, like ; (with the extra subtlety that if you use spaces instead of ;; you should enter elements by row instead of by column).
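For example, these two literals produce the same matrix, one entered by row with the Matlab-like sugar and one by column with semicolons (a quick check):

```julia
# Matlab-like sugar: spaces separate columns, a single ; separates rows
A = [1 2; 3 4]

# pure-semicolon form: single ; increments the first (row) index,
# ;; increments the second (column) index, so elements are entered by column
Acol = [1; 3;; 2; 4]

@assert A == Acol   # both are the 2×2 matrix [1 2; 3 4]
```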


There’s an argument that we should have a vector * vector method that treats the left vector as a 1-column matrix and works if the right vector has 1 element, throwing an error otherwise. This would be more consistent with other methods treating vectors as 1-column matrices when the shapes work.
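Such a method doesn’t exist in Base, but a sketch of the proposed semantics might look like this (vecmul is a hypothetical name, not a real Julia API; overloading Base.:* on two Base types would be type piracy, so a plain function is shown):

```julia
# Hypothetical vector * vector: treat u as an n×1 matrix, so the product
# only makes sense when v has exactly one element (an n×1 times 1×1 product).
function vecmul(u::AbstractVector, v::AbstractVector)
    length(v) == 1 || throw(DimensionMismatch(
        "second vector must have length 1, got $(length(v))"))
    return u * v[1]   # scale u by the single element of v
end

vecmul([1, 1], [5])   # == [5, 5]; B * (K * x) would associate again
```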


Just for future reference, another practical solution that works with the original arrays as they are is:

let
	B = [1;1]
	K = [1 2]
	x = [1;2]

	@show (B*K)*x 
	@show B*only(K*x) 
end;

It sounds like turning B into a matrix works better conceptually for you in this case, but only is a generally useful function to keep in mind for turning a single-element array into a scalar value.
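A quick illustration of only’s behavior (nothing beyond Base is assumed):

```julia
@assert only([5]) == 5      # 1-element Vector → scalar
@assert only([5;;]) == 5    # works on a 1×1 Matrix too, or any 1-element iterable

# only refuses containers that don't have exactly one element:
@assert try only([1, 2]); false catch e; e isa ArgumentError end
```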


I have a question: does x' always resolve to transpose(x), as long as complex numbers are not involved?

Just for notational clarity: the standard syntax for vector literals is [1, 2], that is, commas instead of semicolons. They end up doing the same thing, but using commas sets them apart visually, which might be beneficial.

x' always corresponds to adjoint(x), but yes: adjoint is (recursive) transposition for real numbers.
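Concretely: for real arrays the two are elementwise equal, while for complex arrays the adjoint also conjugates (a quick check):

```julia
xr = [1.0, 2.0]
@assert xr' == transpose(xr)           # equal for real vectors

zc = [1 + 2im, 3 + 0im]
@assert zc'[1] == 1 - 2im              # adjoint conjugates the entries...
@assert transpose(zc)[1] == 1 + 2im    # ...transpose does not
```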

Maybe I’ll stick to transpose for rigor, although it’s lengthier.
(I don’t work with complex numbers.)

Why do you think transpose is more “rigorous” for real arrays?

(For most applications in linear algebra, I would say that adjoint is generally what you want, for either real or complex arrays. The reason this is an important operation in linear algebra is its relationship to inner products.)


Taken literally, transpose is the operation I want (since complex numbers are not involved in my context). At least the two display differently in the Julia REPL:

julia> transpose(x)
1×3 transpose(::Vector{Float64}) with eltype Float64:
 0.715859  0.0920015  0.183793

julia> x'
1×3 adjoint(::Vector{Float64}) with eltype Float64:
 0.715859  0.0920015  0.183793

It’s not only the inner product; it’s arguably just as important for the outer product.
I may transpose other objects as well (e.g. Array{JuMP.GenericVariableRef{Float64}, 1} in this post).

Couldn’t you just as easily say that the adjoint is the desired operation?

transpose is deemed the more fundamental operation, e.g., in my post linked above.
adjoint comprises transpose and conjugation (the latter is not needed in my case).

I would disagree. In linear algebra, the more fundamental operation is the adjoint ^* (' in Julia), which is defined by the inner product \langle x, y \rangle = x^* y for vectors, and \langle x, A y \rangle = \langle A^* x, y\rangle for any linear operator A.
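Both defining identities are easy to verify numerically (a sketch using LinearAlgebra.dot and random complex arrays):

```julia
using LinearAlgebra

x = randn(ComplexF64, 3)
y = randn(ComplexF64, 3)
A = randn(ComplexF64, 3, 3)

@assert dot(x, y) ≈ x' * y              # ⟨x, y⟩ = x* y
@assert dot(x, A * y) ≈ dot(A' * x, y)  # ⟨x, A y⟩ = ⟨A* x, y⟩
```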

For real vectors with the usual Euclidean inner product, the adjoint coincides with what people call the “transpose”, and this is the main reason why transposition is so common and has a special name. (As opposed to many other rearrangements you could imagine, like flipping a matrix or vector upside down or rotating a matrix 90°, which don’t have a standard name.) Every time you see a transpose in linear algebra, there’s a good chance that (a) you’re working with real numbers, (b) there is an inner product lurking implicitly nearby, and (c) if you had complex numbers you’d probably need the conjugate-transpose instead.

But anyway, this is a philosophical point. You are free to write transpose(x) instead of x' in Julia if you want; the two are functionally and computationally equivalent for real vectors and matrices.

(There are fairly rare exceptions where you transpose complex vectors without conjugation; that’s why it’s still useful for Julia to have a separate transpose function. If you’re not dealing with linear algebra at all and just want to rearrange a container, there is permutedims.)
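One concrete difference: transpose (and adjoint) recurse into the elements, while permutedims only rearranges the container (a small check with a matrix whose elements are themselves matrices):

```julia
a = [1 2; 3 4]
b = [5 6; 7 8]

# a 1×2 matrix whose elements are 2×2 matrices
M = Matrix{Matrix{Int}}(undef, 1, 2)
M[1, 1] = a
M[1, 2] = b

@assert transpose(M)[2, 1] == [5 7; 6 8]    # recursive: the inner block is transposed
@assert permutedims(M)[2, 1] == [5 6; 7 8]  # non-recursive: the block moves as-is
```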
