Broadcasting across a nested array

I think what that means is that the current way of doing things in Julia allows having a convenient syntax to broadcast any operation.

If you specifically want to define elementwise operations. Then broadcasting may help you for the first level. For other levels, you can still apply broadcasting to broadcasted operations (which you then have to define when they don’t exist, and is arguably more cumbersome way). But, as mentioned above, there are also other ways of doing things, for example using comprehensions:

julia> [[1,2].*y for y in [[3,4], [5,6]]]
2-element Array{Array{Int64,1},1}:
 [3, 8] 
 [5, 12]

Another point should probably be made here. Elementwise operations do exist in Julia. If you want to perform the elementwise product of two arrays of the same arbitrary size, you can always use… the broadcasted * operation!

What we are speaking of here is not an elementwise operation, since operands don’t have the same shape. It happens that what you wanted in this very specific case could have been more easily implemented by a combination of broadcasting and elementwise product if such a product had been available with a specific name (as opposed to the broadcasted use of a generic *operator).

Now, there are different ways of interpreting the “elementwise” product of [1,2] and [[3,4], [5,6]]. One is, like above, to think of [1,2] as a scalar that you multiply (elementwise) in turn to [3,4] and [5,6]. Another way could be to consider [1,2] and [[3,4], [5,6]] both as vectors of size 2 and perform their elementwise product. The first element of the resulting array would then be the “elementwise” product of 1 and [3,4] (where 1 would have to be considered as a scalar and multiplied elementwise to both elements of [3,4]):

julia> [x.*y for (x,y) in zip([1,2], [[3,4], [5,6]])]
2-element Array{Array{Int64,1},1}:
 [3, 4]  
 [10, 12]

If elementwise product was the default in Julia, then what would/should be the correct interpretation of:

[1,2] .* [[3,4], [5,6]]

My answer would be that there is no correct definition of the above, which is why it errors out (as it should). And then we’re back to the conclusion that in order to unequivocally define what you want to do, you have to mix broadcasting (where you have some control over what gets considered as a scalar) and “plain” elementwise operations (where all operands have to have the same shape) on multiple levels. I’m not sure it can be reasonably expected to find a simple syntax allowing to easily describe the combinatorics of possible interpretations when the number of levels increases.

2 Likes

Thank you for this extensive explanation of reasoning! I agree with the difficulties and ambiguities you mention, and plain arrays of plain arrays are probably not the most common thing anyway. However arrays of staticarrays are more common, and while a .+ b works fine for such cases, seemingly-same a ./ b does not. And even understanding the reasons of why it is made so does not help with significantly more verbose syntax and unintuitive failure modes. E.g. a ./ b executes successfully, it’s only somewhere later things go wrong: in case of a <: Array{SVector} when a function expecting a vectors gets a matrix, and in case of a <: Array{SArray{Tuple{3, 3}}} actually never! Nothing throws an error, only the end result differs.

Because addition of same-shape arrays is well-defined. Division of arrays means some kind of inverse of matrix multiplication, which is not defined for all shapes.

No, it’s always doing what you ask it to do, which is that . broadcasts the given operation, based on the outermost containers. If that operation isn’t what you want done to their contents, then it can’t guess what you do want. Not trying to guess is precisely how it avoids having hard-to-predict failure modes.

If you want a different operation to be performed, then you must ask for this. But there are many possibilities, you must choose.

Once you’ve chosen, you can have very concise syntax, at the slight cost of not being ASCII. You can make a package so that for ever after, it’s one line using SecondLevel.

Apologies if this sounds harsh, but this actually an exceptionally well-thought out piece of the language. It’s not done this way because of some hardware limitation in 1973, or some tribal dispute lost in the mists of time. It’s very flexible, and allows lots of people to do lots of different things, predictably, legibly, and fast.

3 Likes

Sure. Division also is, in exactly the same elementwise way as addition.

For those who typically do matrix calculations? Yes, I guess. For everyone else? I really doubt.

But the thing is - it does not avoid them! See my previous post for example: how should one (having experience with other languages, eg python or R) guess that even though a .+ b does the expected broadcasting all the way down, a .* b gives results of the same shape (i.e. no errors anywhere) but completely different content (in case of Array{Array{2}})? Note I’m not talking about language designer perspective, only from the user’s point of view.

And regarding division specifically - even you say “some kind of inverse” suggesting it is not obvious. Indeed, I cannot even find in the docs what it does for vectors! Docs say Right division operator: multiplication of x by the inverse of y on the right. Ok, do we have vector inverses? No: inv([1, 2]) says no method. However, [1, 2] / [1, 2] gives some result…

UPD: actually, thinking about this I even can see a possible solution to the confusion. If all those matrix operations are defined in LinAlg, then maybe just don’t import (using) it by default and require explicit action from the user? Then all the arithmetics on 1/2-d arrays will just fail instead of returning some (unexpected to those not doing matrices at this time) result.

Incidentally, both * and eg Matrix are defined in Base, so strictly speaking LinearAlgebra is engaged in type piracy :wink:

Division of arrays means some kind of inverse of matrix multiplication

For those who typically do matrix calculations?

No, for Julia! I’m telling you how it thinks. This is what * and / and \ mean. Now you know. It’s not that confusing.

There are obviously other ways to define a language, there are even some non-crazy ways in which * is always scalar multiplication. Those are other languages. At this point, wishing that * was a different concept is like wishing Russian didn’t have gender or cases, for your ease of learning. It does.

I cannot even find in the docs what it does for vectors! Docs say Right division operator: multiplication of x by the inverse of y on the right.

That’s what it is: x*inv(y). In the case of matrices, inv is the matrix inverse. (And / is computed another way, which is numerically superior). A vector space does not have the notion of an inverse, thus inv(::Vector) does not exist. (But inv.(list) is obviously fine.)

Indeed the help could probably be more verbose, as it is for \, and if you’d like something good to come of all this, please make a PR adding a sentence.

1 Like

So what does [1, 2] / [1, 2] compute then? :slight_smile: [1, 2]*inv([1, 2]) doesn’t work, of course.

Some things have a left-inverse without having an inverse:

julia> M = [1, 2] / [3, 4]
2×2 Array{Float64,2}:
 0.12  0.16
 0.24  0.32

julia> M * [3, 4]
2-element Array{Float64,1}:
 1.0
 2.0

(Or maybe it’s a right-inverse, I never know the names, but the notation is clear.)

Ah, OK, sure. After this case with a ./ b giving unexpected results happened to me, I looked up the reason and found how it works. Luckily both were Array{Array{1}} in my case, so the shape of the result was different than I expected, otherwise I would have probably spent much more time finding the cause of my code giving wrong results.

But this (“how julia thinks”) is not arbitrary or given by gods or … . And as it turns out, some parts of stdlib (happily, not the language itself) are tailored to those doing matrix operations - so far so good. However, as linear algebra is only one from many many areas language can be used in, it would make sense not to include the corresponding functions by default. This would eliminate the confusion that some simple functions work in an obviously correct way for basically everyone (e.g. +, -) while some don’t (e.g. *, /). If a ./ b errored not only for Array{Array{1}}, but also for Array{Array{2}} this certainly would make more sense to those not thinking of linear algebra when writing a completely unrelated piece of code.

Some things have a left-inverse without having an inverse:

Sure, thanks. This is only a minor doc issue then :slight_smile:

How Russian works is also not arbitrary or given by gods.

But there’s no way / is going to become a strange operator whose meaning changes wildly if you load LinearAlgebra. Avoiding such nonlocal changes is why piracy is a crime.

If you think you may trip over this repeatedly, another thing you could define is a multiplication with an error:

julia> *′(x::Number,y::Number) = *(x,y)

julia> *′(x::AbstractMatrix, y::AbstractVecOrMat) = error("*′ expects scalars, it looks like you used it for matrix multiplication again!")
*′ (generic function with 2 methods)

julia> [rand(2,2), rand(2,2)] .*′ [rand(2)]

In one line of MacroTools.postwalk you can then make a macro which replaces every * with *′. And with zero performance penalty.

1 Like

As a Julia beginner with an R and python background, I also found Julia’s seeming bias towards matrix operations to be one of the few surprises in an otherwise easy and intuitive learning experience.

In my work, I tend to use something like Vector{SVector{N,Float64}} at least 10x more often than Matrix{Float64}, so the reason for this bias wasn’t obvious to me, but at this point I take the claim that the design decision leads to a more elegant type system at face value.

However, even if it is a better design, it doesn’t change the fact that it is a stumbling point for what I would guess is a large number of newcomers.

For me, I tried things like rand(2) + rand(2) and thought “Oh, that’s nice. It seems there is implicit broadcasting.”, but then was confused by why rand(2) / rand(2) didn’t work as expected. I had the same experience with rand(2) / 0.3 working, but rand(2) - 0.3 not working.

I think the idea of adding a warning in the docs for beginners could help, but it’s still likely to be a gotcha that many beginners will just need to learn the hard way.

1 Like

Maybe the documentation chapter about Noteworthy Differences from other Languages could be made more prominent…

In particular, its section devoted to differences between Julia and R discusses most stumbling points mentioned in the present discussion. For example:

  • Like many languages, Julia does not always allow operations on vectors of different lengths, unlike R where the vectors only need to share a common index range. For example, c(1, 2, 3, 4) + c(1, 2) is valid R but the equivalent [1, 2, 3, 4] + [1, 2] will throw an error in Julia.

  • Julia’s * operator can perform matrix multiplication, unlike in R. If A and B are matrices, then A * B denotes a matrix multiplication in Julia, equivalent to R’s A %*% B . In R, this same notation would perform an element-wise (Hadamard) product. To get the element-wise multiplication operation, you need to write A .* B in Julia.

  • Julia performs matrix transposition using the transpose function and conjugated transposition using the ' operator or the adjoint function. Julia’s transpose(A) is therefore equivalent to R’s t(A) . Additionally a non-recursive transpose in Julia is provided by the permutedims function.

I haven’t seen any mention of /, though. If, as users with an R background, you feel like this would be valuable information for others, then maybe you could propose a PR on this matter?

6 Likes

FWIW, I think the best way to learn Julia is reading through the part preceding the base docs (“Manual”) first. It takes time, and of course no one is expected to remember all the details from one reading, but it illustrates what is out there so one can find it later.

Trying to figure out patterns “experimentally” can lead to surprises like the one above. Some people get over that quickly, but in some cases that triggers a “let’s redesign Julia so that it works like the previous language I used” post.

2 Likes

That R section is pretty good! Maybe it just needs a bullet point to explain that / and \ are cousins of *, while ./ is like .*.

I see that the Python section doesn’t mention such things, nor the other important difference with /:

>>> np.array([6,7,8]) * np.array([1,2,3])
array([ 6, 14, 24])
>>> np.array([6,7,8]) / np.array([1,2,3])
array([6, 3, 2])
1 Like

I’m really not sure why there is so much opposition to my suggestion to just don’t include LinearAlgebra-related operators and functions into the namespace by default. For people coming from other languages and seing that a + b works just the same as a .+ b the obvious conclusion is “automatic broadcast” or something similar. So then they try a * b or exp(a) or …, and get reasonably-looking result at the first glance (same shape and type), but completely unexpected value which may remain unnoticed for quite some time. If it was MATrix LABoratory then this bias to matrices would be completely expected, but not for general purpose or even general-scientific-computing-purpose language.

Actually, interesting to note that even currently not all reasonably-applicable-to-matrices functions in base actually work for matrices, even aritmetic: one of examples is cis().

If you think you may trip over this repeatedly, another thing you could define is a multiplication with an error:

Yes, and also division, exponentiation, trig functions and so on. Ah, and use *' everywhere instead of *.

Btw, how do you know that one can append ' to a symbol and still get valid operator name? I don’t use non-ASCII in my code for obvious compatibility reasons, and this limited my ability to implement new operators - but this ' symbol is certainly helpful.

It’s more than slightly off-topic, but I’m curious: what are the “obvious compatibility reasons” that keep you from using non-ASCII characters? They aren’t obvious to me, and I haven’t yet run into them myself.

1 Like

The main issue is quite a few of server terminals I work with have (sometimes different) issues with unicode support. So I cannot rely on being able to type or even correctly see various “advanced” unicode characters.

Even without that, it’s not easy to type those unicode symbols in a text editor supporting unicode, but not specifically tailored to julia and corresponding autocomplete - even jupyter text editor does not support them. I don’t see any reason for me to avoid such editors instead of just avoiding unicode as in any other language.

1 Like

Then that’s the wrong conclusion. It’s like saying that the result of 0 + 0 and 0 * 0 are the same, so + and * must be the same thing.

Frankly, I don’t understand the basis for this kind of argument at all. Julia has a manual which describes the language and the standard libraries. People who want to use Julia should just read it, or be prepared to encounter difficulties. It is unreasonable to expect that a computer language can be learned by generalizing from experiments.

If you need this functionality, perhaps consider opening an issue.

I’m really quite confused as to why you’re so adamantly opposed to our current setup… and what your proposed improvement would be. Note that * and + are not automatically broadcast enabled for matrices. Unlike broadcast, + only works when the shapes are identical. And unlike broadcast, * only works for compatible matrix/vector products or scalar multiplication.

Is it really an improvement to take away * and + and / from the behavior of matrices when they’re well-defined operations?

1 Like

It would heavily limit the number of julia (or any other language) users if everyone was require to read the whole manual just to use the language.

Is there any other language where + and * are the same? There are a lot of languages where a + b broadcasts. It sounds strange to dismiss previous experience completely.