Matrix multiplication - inconsistent behaviour

OK- thanks. I may be able to use that.

This is exactly what I tried to tell you.

2 Likes

I think the source of your issue isn’t matrix multiplication, it’s the fact that indexing with a scalar drops that dimension. This is consistent with the idea that in indexing the dimension of the result should match the dimensionality of the index. Scalars are 0-dimensional, so that dimension is dropped.

julia> A = rand(2,3,4);

julia> size(A[1, 1, 1])
()

julia> size(A[1:1, 1, 1])
(1,)

julia> size(A[1:1, 1:1, 1])
(1, 1)

julia> size(A[1:1, [1, 2], 1])
(1, 2)

julia> size(A[1:1, [1 2; 2 3], 1])
(1, 2, 2)

I think to get the behavior you want you’d want to overload getindex(::Array), which is the function that’s called for indexing, e.g. the method called for A[1, 2] is getindex(::Array, ::Int64, ::Int64). You can see this with @which:

julia> @which A[1,1]
getindex(A::Array, i1::Int64, i2::Int64, I::Int64...) in Base at array.jl:745

If you overloaded getindex so that indexing with a scalar kept that dimension, then M[:, 1] would give you a Nx1 Matrix, and M[1,:] would give you a 1xN Matrix.

You can poke around at that code to see how the dispatch happens if you want to tweak things. What other folks are warning about is that this will likely break other people’s code if you use any other packages (or if you publish a package that someone else uses). That said, if you’re interested in this, there’s nothing wrong with playing around with it, and likely learning more about Julia internals along the way. Just know that you’re modifying the global method table, so you may end up getting into a state where you want to restart your interpreter to get things back to the default state.

Happy hacking. :slight_smile:

6 Likes

Hi ssfr,

Thanks so much for your constructive, informative and helpful message. That really helps me!

I have indeed made various ‘hacks’ to my interface with Julia that have made my experience much more to my liking, with no major disasters yet [though I note your comment about my needing to be careful if post code].

Despite some personal attacks above, I have learned something from this thread, and can see that hacking here might be more problematic.

On another note, I wonder if there exists anywhere a post called ‘Julia gotchas’, which are things in Julia which might seem ‘unexpected’ to some.

For instance, most of the introductory sections for Julia have the same material about matrix algebra, multiplication, etc., but I never recall anywhere seeing ‘but watch out for the difference between A[:,1] and A[:,1:1]’ in any of the brief introductions…

Anyway, thanks so much!

And, my reading of the thread is that no one personally attacked you. Rather, everyone who replied to your post has been cordial. Providing some pushback and asking you to clarify your question is par for the course, but if you’re going to view these replies as “judgemental”, then you’ll not gain as much from the forum as you wish to.

2 Likes

Hi, yes, on balance, cordial.

I reserve my right to ‘hack’, though, and am not apologetic about it!

I think that’s quite enough. Up until now there was one single personal attack in this thread, and that was made by you:

Your smug and rude attitude has turned me off from providing any more help to you, as I’m sure you’ll be glad to hear.

6 Likes

I have appreciated your help.

Why not just

julia> mat1=randn(3,3); mat2=randn(3,2);

julia> mat1*mat2
3×2 Array{Float64,2}:
  1.05309  -1.18784
  1.50715  -1.45982
 -1.28743  -1.23848

julia> mat1[1,:]'*mat2[:,2]
-1.1878401856079612

julia> mat1[1,:]'mat2[:,2]
-1.1878401856079612

Transpose (take the adjoint of) the left vector?

Thanks for the input, but if the indices are variables, then they may sometimes be length more than 1, in which case this fix would need to be conditional. I was trying to avoid special conditions.

Just so it’s clear, a length-1 vector will behave the exact the same as a length-n list when used as an index. The issue is that indexing with a length-1 list is not the same thing as indexing with a scalar.

In many cases a 1x1x1 array, 1x1 matrix, 1-item vector, and a scalar are interchangeable, but they are different types of thing, so there are places (like indexing) where they give you slightly different results.

1 Like

Hi!

Just a small advice. I think many people coming from other languages hope that Julia behaves like those other languages (I have been once in this ship since I came from MATLAB). However, my advice is: do not try to make Julia behave like the other language (which seems to be what you are doing), but try to understand the Julia way. In this case, I can assure you will fall in love when you realize how consistent it is and how fun it can be :slight_smile:

8 Likes

With regards to 1x1 matrices, it not always works as expected.

The code below gives an error that the * operator is not matching the arguments:
MethodError: no method matching *(::Array{Int64,1}, ::Array{Int64,1}).
For the 2x2 * 2x1 multiplication it works, but for the 1x1 * 1x1 does not.
If I would write A*B', the 1x1 * 1x1 works, and the 2x2 * 2x1 case breaks as expected.

Is there a not too convoluted workaround without writing my own matmul function?
Did I miss something?
(I’m not very knowledgable with Julia, but this problem feels kinda strange.)

f(A,B) = A*B

A=[1 0;0 1]
B=[2; 3]
f(A,B)

A=[1]
B=[1]
f(A,B)

I think the problem could be solved by making a special case for *(::Array{T,1}, ::Array{T,1}) when checking compatible arrays. (However I’m a newbie for Julia, I don’t really know if it works this way.)

You don’t have 1x1 matrices in the case of A = [1]. Those are one-vectors. If you actually use 1x1 matrices, then this works as you expect:

julia> f(A,B) = A*B
f (generic function with 1 method)

julia> A = fill(1, (1, 1))
1×1 Array{Int64,2}:
 1

julia> B = fill(1, (1, 1))
1×1 Array{Int64,2}:
 1

julia> f(A, B)
1×1 Array{Int64,2}:
 1
4 Likes

Thanks for the quick reply!

To ease the frustration for those who come after me, a quick fix:

A=A[:,:] # Array to Matrix conversion
f(A,B)

Only needed on the first operand, works as expected and not pollutes the code too much.

This thread remembers me of this discussion I had. Summarizing: eachrow returns a Vector that is a column, not a row, because dimensions are dropped.

I remember after finding a thread of the Julia founders (I think a issue in the Julia Language Github) where they decided to start dropping dimensions (this was not always the Julia language behavior). I do not remember the exact reasoning anymore, but I remember to think ‘hmm, there is much else involved that I did not think of, for many problems dropping the dimensions is the most elegant/seamless decision’. That said, I think in some cases the code is much more annoying to write because of this behavior. As others said, this kind of type piracy (not all type piracy) is considerable dangerous. However, instead of “ajusting” I would recommend you to consider if it is worth creating a new Array type (a subtype of AbstractArray) that does not possess such behaviour (never drops dimensions), use it in your code, and see if for some cases it is better to work with. It could make an interesting package. We have packages for vectors starting in random/arbitrary offsets, why not arrays that do not drop dimensions when they become unitary.

What a great idea!!!

Found the thread: https://github.com/JuliaLang/julia/issues/4774

Note that it is one of the largest Julia issue threads. It has 417 comments, most of them well written treatises of many paragraphs, spanning from November 2013 to January 2017. I understand your annoyance with this feature, as I share the discomfort, but it was not an unplanned side-effect that arose among distracted language designers, it was seriously discussed, studied, and the current behavior was considered the lesser of two evils.

11 Likes

Yes 4774 is one of those epic issue numbers that many of us have memorized. Beyond the issue itself, it has spun off several talks, a paper, and even a :4774: emoji in honor of its author :4774:.

7 Likes

The idea of a new type seems to get around this by letting people choose for themselves whether or not the potential pitfalls that have been envisaged are relevant to their applications. The addition of a new type would not hurt anyone, as nobody would be required to use it.

1 Like