Matrix multiplication - inconsistent behaviour

Hi!

Just a small advice. I think many people coming from other languages hope that Julia behaves like those other languages (I have been once in this ship since I came from MATLAB). However, my advice is: do not try to make Julia behave like the other language (which seems to be what you are doing), but try to understand the Julia way. In this case, I can assure you will fall in love when you realize how consistent it is and how fun it can be :slight_smile:

8 Likes

With regards to 1x1 matrices, it not always works as expected.

The code below gives an error that the * operator is not matching the arguments:
MethodError: no method matching *(::Array{Int64,1}, ::Array{Int64,1}).
For the 2x2 * 2x1 multiplication it works, but for the 1x1 * 1x1 does not.
If I would write A*B', the 1x1 * 1x1 works, and the 2x2 * 2x1 case breaks as expected.

Is there a not too convoluted workaround without writing my own matmul function?
Did I miss something?
(I’m not very knowledgable with Julia, but this problem feels kinda strange.)

f(A,B) = A*B

A=[1 0;0 1]
B=[2; 3]
f(A,B)

A=[1]
B=[1]
f(A,B)

I think the problem could be solved by making a special case for *(::Array{T,1}, ::Array{T,1}) when checking compatible arrays. (However I’m a newbie for Julia, I don’t really know if it works this way.)

You don’t have 1x1 matrices in the case of A = [1]. Those are one-vectors. If you actually use 1x1 matrices, then this works as you expect:

julia> f(A,B) = A*B
f (generic function with 1 method)

julia> A = fill(1, (1, 1))
1×1 Array{Int64,2}:
 1

julia> B = fill(1, (1, 1))
1×1 Array{Int64,2}:
 1

julia> f(A, B)
1×1 Array{Int64,2}:
 1
4 Likes

Thanks for the quick reply!

To ease the frustration for those who come after me, a quick fix:

A=A[:,:] # Array to Matrix conversion
f(A,B)

Only needed on the first operand, works as expected and not pollutes the code too much.

This thread remembers me of this discussion I had. Summarizing: eachrow returns a Vector that is a column, not a row, because dimensions are dropped.

I remember after finding a thread of the Julia founders (I think a issue in the Julia Language Github) where they decided to start dropping dimensions (this was not always the Julia language behavior). I do not remember the exact reasoning anymore, but I remember to think ‘hmm, there is much else involved that I did not think of, for many problems dropping the dimensions is the most elegant/seamless decision’. That said, I think in some cases the code is much more annoying to write because of this behavior. As others said, this kind of type piracy (not all type piracy) is considerable dangerous. However, instead of “ajusting” I would recommend you to consider if it is worth creating a new Array type (a subtype of AbstractArray) that does not possess such behaviour (never drops dimensions), use it in your code, and see if for some cases it is better to work with. It could make an interesting package. We have packages for vectors starting in random/arbitrary offsets, why not arrays that do not drop dimensions when they become unitary.

What a great idea!!!

Found the thread: https://github.com/JuliaLang/julia/issues/4774

Note that it is one of the largest Julia issue threads. It has 417 comments, most of them well written treatises of many paragraphs, spanning from November 2013 to January 2017. I understand your annoyance with this feature, as I share the discomfort, but it was not an unplanned side-effect that arose among distracted language designers, it was seriously discussed, studied, and the current behavior was considered the lesser of two evils.

11 Likes

Yes 4774 is one of those epic issue numbers that many of us have memorized. Beyond the issue itself, it has spun off several talks, a paper, and even a :4774: emoji in honor of its author :4774:.

7 Likes

The idea of a new type seems to get around this by letting people choose for themselves whether or not the potential pitfalls that have been envisaged are relevant to their applications. The addition of a new type would not hurt anyone, as nobody would be required to use it.

1 Like

Yeah, for people interested in this, the way forward would be to create a package with an array type that does not drop dimensions and implement the required methods on that. Then everyone who wants that behaviour just uses that package.

5 Likes

I fear that’d just replace one pitfall with an endless stream of other pitfalls. Significantly better if we could locally shadow indexing like TwoBasedIndices used to do or to use a macro — those would actually work like the folks here want.

But I really encourage folks who are transitioning to Julia to give this some time. I used to think we might need some special syntax to very tersely and efficiently create [i] for this purpose, but this is the first big thread about this behavior in years.

2 Likes

cough my thread had 24 posts and it is from this year cough

What? No, I said nothing.

1 Like

So, just for me to understand better, what you recommend is something more like UnsafeArrays.jl strategy? There you can do:

@uviews A begin
    # inside this block A is replaced by an view object to the outer scope A
    # so changes to this A reflect in the outer A, but making slices of A
    # inside this block do not copy nor allocate, they are just pointers.
end

In this case, inside the block A would be replaced too by a view-like object that just wraps and modifies original A but the wrapper type do not drop dimensions? And when the scope ends the A is not the wrapped object so it already have the dimensions dropped if it was the case.

No, it’d be more akin to the @views macro and would walk through your expression tree to replace all getindex calls with a getindex_preserving_dimensions function — just like how @views swaps in view for getindex.

There’s been a slow but steady effort to make Julia’s syntax more locally overridable — for example in 1.5 you’ll be able to lexically change what gets called upon A' by locally shadowing var"'":

julia> module Test
       const var"'" = x->x^2
       a = 2
       @show a'
       end
a' = 4
Main.Test

This isn’t (yet) done in the same manner for square brackets. One tricky consideration is the fact that square brackets are used for both indexing and array literals.

Thanks for the pointer to your previous thread — I can no longer keep up with all the discussions here!

1 Like

Completely understandable. Also, I do not know if 24 posts is considered long, XD.

Julia 1.5 seems to be very interesting. I am waiting for the non-allocating views that will make UnsafeArrays.jl not necessary anymore.

I proposed the solution of UnsafeArrays.jl strategy just because I found it was a little more accessible to someone with less experience, as the wrapper type does the heavy lifting and wrapper types are a more commonly used/understood concept. The macro doing the heavy lifting has the good side effect of not needing a wrapper type, nor black non-allocating structs magic, basically no runtime overhead, just needs someone that has a little more experience (I, for one, have never written a macro, even if I make widespread use of them).

@compleat, what you said in your 3rd post still catches my eye:

I’d like to make sure, by specific example, that by now you’ve caught the little gem that the existing array behavior serves you well on this:

julia> mat1=randn(3,3); mat2=randn(3,2);

julia> ind1=[1,3]; ind2=[1,2]; mat1[ind1,:]*mat2[:,ind2];

julia> ind1=[1,3]; ind2=[2]; mat1[ind1,:]*mat2[:,ind2];

julia> ind1=[1]; ind2=[2]; mat1[ind1,:]*mat2[:,ind2];

(Note each succeeded without error. i.e. a vector of indices is precisely needed, even if length 1.)

Once you get used to the wisdom of this, you might find it natural to replace [1] with 1:1.

If unconvinced, I hope at least I’ve reached another reader of this thread coming from Matlab. Best of luck-

5 Likes

Hey @mbauman, while hot on the grill maybe you could teach me as a user what I should consider trade-off (if any) between using [i] and i:i. Trivial, yeah.

It occurs to me i:i may consume double the itty bitty space and not even iterate any more efficiently. I don’t know. But I assume they should interact identically with other ranges and functions of vectors, etc, or at least aspirationally. Otherwise just style? I can see [i] might look less readable when inside square brackets itself, as is often the case, but OTOH [i] is appealing in not violating the Don’t Repeat Yourself principle, as in [really_long_name___or_worse___expression].

I’m also striving out of curiosity to understand what you meant by

The biggest distinction between [i] (a Vector) and i:i (a UnitRange) is simply that the Vector is mutable. This has huge implications in both how it behaves and the sorts of optimizations Julia can easily perform. As an example, writing [1] in two separate places will create two separate Vectors because they might change!

julia> A = [1]; B = [1];

julia> A == B
true

julia> A === B
false

julia> push!(A, 2)
2-element Array{Int64,1}:
 1
 2

julia> A == B
false

Every time you create a Vector, Julia needs to be prepared for its value and/or its length to possibly change. More succinctly, every Vector you create has a unique identity. UnitRanges, on the other hand, are immutable. They cannot change their values, so there’s no behavior at all that Julia needs to track. In fact, Julia cannot possibly tell the difference between two separate creations of 1:1. It is entirely defined by the value of its two endpoints.

So, ironically, while it “feels” like 1:1 should take up twice as much space as [1], it actually ends up taking no space at all in many common situations as Julia doesn’t even need to construct it in the first place. Further, the Vector has a header that it uses to track the array’s length and such. And so you end up seeing something like this in practice:

julia> @allocated [1]
96

julia> @allocated 1:1
0

I think x:x that is the most succinct way of currently writing an immutable one-vector, but you’re right it’s not DRY at all. In the future Julia might get smarter about identifying uses of array literals that happen to not need the mutable behaviors and thus give them the same performance, but that’s a harder thing to identify than when it’s promised up front.

6 Likes

Ah yes, I again overlooked the key mutability aspect. I was especially entertaining examples where the [i] is used without explicitly storing, e.g. mat1[[1],:]*mat2[:,[2]], so it’s interesting to hear mutability then too has a strong overhead.

Yes, in terms of DRY I’m myself tempted to write range(i, length=1) instead of i:i when they arise, or of course range(an_expression, length=1) instead of i:i preceded by i=an_expression. Perhaps an infix will someday arise for range(i, length=n) since it may be argued to be as natural as i:j. (And in that case, spelled-out range usages call that infix as they do (:), which would seem to result in a low overhead infix operator.)

Julia getting “smart” about array literals sounds like Matlab; and I don’t mean that flatteringly…

If you use these constructs often, nothing prevents you from defining a function as a shorthand. Eg

range_from(i, n) = range(i; length = n)

or similar, or

singleton_range(i) = i:i

You can of course use a short function name, or possibly Unicode symbols, to make these succinct.

I don’t understand what you mean here; the only array literals I see mentioned in this topic are [] for constructing vectors — nothing special is going on there.

1 Like