Broadcasting across a nested array

But be careful with this approach, as it’s far from general: e.g. if you replace + with * it will not work, and / will give surprising result.

I would say that the technique is general in that it does broadcast the operation, whatever it is (either + or * or /). Of course, said operation has to be defined for the “scalar” operands, which is what the error is about in your examples.

To elaborate on your * example:

julia> Ref([1,2]) .* [[3,4], [5,6]]
ERROR: MethodError: no method matching *(::Array{Int64,1}, ::Array{Int64,1})

does not work because [1,2] * [3,4] does not work. In other words, the * operator is correctly broadcasted, but this yields something which is not well defined. For example, the first item in the resulting array should be:

julia> [1,2]*[3,4]
ERROR: MethodError: no method matching *(::Array{Int64,1}, ::Array{Int64,1})

In order to perform the element-wise product of two arrays, you’d need to broadcast the * operator a first time:

julia> [1,2].*[3,4]
2-element Array{Int64,1}:
 3
 8

Therefore, it is this element-wise * operator which needs to be broadcasted a second time over arrays of arrays operands. One way of doing that is the following:

julia> ((x,y)->(x.*y)).(Ref([1,2]), [[3,4], [5,6]])
2-element Array{Array{Int64,1},1}:
 [3, 8] 
 [5, 12]

which is merely a shortcut for something like:

julia> elemwise_prod(x, y) = x .* y
elemwise_prod (generic function with 1 method)

julia> elemwise_prod.(Ref([1,2]), [[3,4], [5,6]])
2-element Array{Array{Int64,1},1}:
 [3, 8] 
 [5, 12]
7 Likes

Out of curiosity, is there a reason there aren’t standard element-wise operators (+,-,*,/,^, etc.) for arrays so that broadcasting across nested arrays could be more convenient (not require 2 levels of broadcasting)?

I get that matrix multiplication and inversion should be convenient as well, but maybe other operators could have been chosen rather than overloading * and /, as is done in other array-focused languages.

I think it is because, as you said, for * you would not know whether the default semantic should be elementwise or in a matrix way.

Instead of defining by default a semantic which would be surprising for a large number of users, I would tend to think the language designers made the safe choice of erroring out. This way forces users to explicitly define what they want.

As for another operator, I don’t know. Do you have a suggestion in mind ? I actually wouldn’t be surprised if there was a set of UTF8 operators distinguishing between the two operations.

3 Likes

For example, R uses %*% for matrix multiplication, so that * still works for element- wise multiplication of vectors or matrices.

Another language that I use, kdb+/q (an APL-like language for financial data), is similar: matrix multiplication with the mmu operator and elements-wise multiplication with *.

The key to understanding Julia’s design choice is that neither arithmetic operators nor number types get “special treatment”:

  1. * is just an operator like any other. It is usually defined for rings, ie objects for which some notion of multiplication (commutative or non-commutative) makes sense. This includes real and complex numbers, ForwardDiff.Dual numbers, matrices, strings… the list is literally endless since you can always define new methods.

  2. Subtypes of <:Number or <:AbstractMatrix are types like any other. So if you prefer, you can write completely generic code, eg for a power calculation by doubling, without knowing which type you are dealing with — it is enough to know that * is supported.

The broadcasting machinery is designed to take care of your use case. At the moment, you have to use something like the solution by @ffevotte, which may look cumbersome but has the advantage of being transparent and consistent with the language.

Other design choices, eg like R’s, are certainly possible, but they break the generic interfaces which are highly valued by experienced Julia users.

9 Likes

But we do have that distinction! It’s the difference between * and .*. Of course, the problem you’re running into is that you want to do broadcasting down two levels, and indeed we don’t have a syntax for that operation.

While it feels like it’d be nice to support this directly with a syntax, I often find it’s best to simply use a comprehension in these cases. I also think comprehensions are much more readable than broadcasting broadcast-enabled anonymous functions (like ffevotte suggests) or a theoretical syntax that would allow us to throw more dots at the problem.

4 Likes

Vectors, tuples and arrays of numbers form a ring (and field) with elementwise operations as well, and as you say it’s not more or less special than e.g. matrices. It’s also likely that elementwise operations are way more widely used in general than matrix ones. However, in julia matrix operations certainly look like they are treated in a special way.

the list is literally endless since you can always define new methods.

But we cannot do that for elementwise-operations field on arrays, as operators are already defined to mean matrix operations. Numpy approach sounds more reasonable in this regard - there is a general array type as a container with elemenwise operations, and matrix type (which could be a zero-overhead wrapper in julia) with matrix operations.

I am not sure about this. Depends on what you do I guess.

It is unclear what you mean here; in any case they aren’t. Matrix operations are functions like any other functions, eg look at *.

It would be more correct to say that some functions have methods for matrices (where it makes sense), but most functions of course don’t. Again, whether a function has an infix operator form is not really relevant here, they are all functions.

That’s perhaps a matter of taste — if I understand the proposal correctly, I would need to know whether A and B are wrapped to make sense of A * B. Yet in Julia, for many operations I don’t really need to know and I am very happy about that.

It is unclear what you mean here; in any case they aren’t. Matrix operations are functions like any other functions, eg look at * .

Again, whether a function has an infix operator form is not really relevant here, they are all functions.

Of course they are just functions, no question about that. My point is a lot of operations which are defined for arrays in Base have the matrix meaning, which (as you say) is just one possible field formed by arrays. Btw, only by 2-dimensional arrays, as opposed to elementwise operations forming a field for arbitrarily sized arrays.

That’s perhaps a matter of taste — if I understand the proposal correctly, I would need to know whether A and B are wrapped to make sense of A * B . Yet in Julia, for many operations I don’t really need to know and I am very happy about that.

When one typically works with just arrays without any matrix structure in them and occasionally needs matrix operations, this makes a lot of sense. As opposed to a / b for two vectors giving a 2-dimensional array.

A choice had to be made, and you have the elementwise form available with very convenient syntax that is also consistent with the rest of the language. What’s not to like? :sunglasses:

I don’t know if you are aware of this, but these issues were considered very carefully in the design of the API of what eventually became LinearAlgebra. If you are interested in the discussions, search for closed issues on “taking seriously” and follow the pointers (and prepare to spend about a day reading :wink:).

Not quite sure what exactly you mean by convenient syntax, in the context of this thread: ((x,y)->(x.*y)).(Ref([1,2]), [[3,4], [5,6]])? As opposed to just Ref([1,2]) .* [[3,4], [5,6]] if arithmetic was elementwise by default.

I don’t know if you are aware of this, but these issues were considered very carefully in the design of the API of what eventually became LinearAlgebra . If you are interested in the discussions, search for closed issues on “taking seriously” and follow the pointers (and prepare to spend about a day reading :wink:).

Sure, I saw a few of those threads (read only parts of them). But reading and understanding arguments there doesn’t mean I agree with the choice (of course, nothing can be changed here in julia). The final choice seems to be very matrix-centric, even though matrices don’t even appear in a large number of programs and scripts (I’m quite sure it’s true for the majority of programs; less so for scientific programming, but even there matrices are far from always needed).

This is not true — if you came up with a coherent proposal in the form of a PR that improves things significantly, I am sure it would be given due consideration for 2.0.

Note, however, that the bar for major overhauls is very high at this point, and requires a much more convincing argument than just citing isolated syntax examples from other languages. This is because all of these syntactic and semantic elements of the current API form a (more or less) coherent whole, and changing just one thing has a lot of ramifications. This is why a PR is the preferred way of arguing for something: when you do all the changes to make the tests run, you need to tackle problems that are easy to miss in a discussion.

An alternative (and less costly) route would be writing a package that defines your desired semantics for a couple of operators (I am sure you could find some nice Unicode ones). This would allow you to experiment, and if others find your solution an improvement, you could find it much easier to argue for a change in Base.

Matrices are pervasive in scientific programming, and Julia is a scientific programming language.

I’m certainly not an expert in language design, and unlikely to propose some solution which is better than the current one and still fits the overall existing ecosystem. Judging as a language user, however, I’ve definitely experienced and seen (among my colleagues trying julia) confusion related to this “priority of matrices”. E.g. arithmetics (this thread), absence of ' (transpose) operator for non-numeric 2d arrays.

Even though we mostly (with julia) do computing for science, matrix operations still appear rather infrequently. And where do you get the “Julia is a scientific programming language” line from? The homepage seems to promote it as a general purpose, fitting various areas (of course scientific computing is an important one of these, and mentioned at the homepage among others).

Acting on arrays, having both * and .* available is really all you can ask for. Defaulting to * really is correct for matrices, and people do use them a lot; defaulting to .* may make sense if your language is less general-purpose, more focused on unstructured lists of numbers. But it’s one more pixel! And the fact that . fuses is often very useful, 2 .* x .+ y becomes one operation.

Acting on arrays of arrays (of arrays!) there are more possible options. This is less common in Julia than some places, as there are proper N-dimensional arrays. The one you are suggesting, of element-wise to the 2nd level when that exists, isn’t a built-in symbol. But you can easily make one, here’s a crude first go:

julia> +₂(x, y) = .+(x, y)

julia> +₂(x::AbstractArray{<:Number}, y::AbstractArray{<:AbstractArray}) = Ref(x) .+ y # and the reverse...

julia> [10 100] +₂ [5, 5]
2×2 Array{Int64,2}:
 15  105
 15  105

julia> [10, 100] +₂ [[5, 5], [6, 6], [7, 7]]
3-element Array{Array{Int64,1},1}:
 [15, 105]
 [16, 106]
 [17, 107]

Here’s the list of allowed unicode symbols which are like + instead of decorating.

I also wish there was a shorter way to write permutedims, there have been various suggestions.

As I wrote above, what you perceive as “priority of matrices” is in fact just a logical consequence of treating * as a general operator on rings. Programming with generic constructs is baked into the design of Julia from the beginning. The same applies to ' (and perhaps you missed the single-argument method for permutedims, which does the non-recursive “transpose”).

Again, if you truly hate these things, they are very easy to override for your own use.

But if you (and your colleagues) just find them surprising, just consider the possibility that they have been designed rather carefully and do form a coherent whole, and in this sense are quite elegant. Just give it some time.

1 Like

No, it is a consequence of choosing the “matrix-field” for arrays as the default instead of “elementwise-field”. It would be just as consistent with the other choice of default, and some unicode operator for matrices instead of * and /.

But we already have a way to do elementwise operations via broadcasting, which has other advantages.

What would be the point of a parallel syntax that does not benefit from fusion?

I think what that means is that the current way of doing things in Julia allows having a convenient syntax to broadcast any operation.

If you specifically want to define elementwise operations. Then broadcasting may help you for the first level. For other levels, you can still apply broadcasting to broadcasted operations (which you then have to define when they don’t exist, and is arguably more cumbersome way). But, as mentioned above, there are also other ways of doing things, for example using comprehensions:

julia> [[1,2].*y for y in [[3,4], [5,6]]]
2-element Array{Array{Int64,1},1}:
 [3, 8] 
 [5, 12]

Another point should probably be made here. Elementwise operations do exist in Julia. If you want to perform the elementwise product of two arrays of the same arbitrary size, you can always use… the broadcasted * operation!

What we are speaking of here is not an elementwise operation, since operands don’t have the same shape. It happens that what you wanted in this very specific case could have been more easily implemented by a combination of broadcasting and elementwise product if such a product had been available with a specific name (as opposed to the broadcasted use of a generic *operator).

Now, there are different ways of interpreting the “elementwise” product of [1,2] and [[3,4], [5,6]]. One is, like above, to think of [1,2] as a scalar that you multiply (elementwise) in turn to [3,4] and [5,6]. Another way could be to consider [1,2] and [[3,4], [5,6]] both as vectors of size 2 and perform their elementwise product. The first element of the resulting array would then be the “elementwise” product of 1 and [3,4] (where 1 would have to be considered as a scalar and multiplied elementwise to both elements of [3,4]):

julia> [x.*y for (x,y) in zip([1,2], [[3,4], [5,6]])]
2-element Array{Array{Int64,1},1}:
 [3, 4]  
 [10, 12]

If elementwise product was the default in Julia, then what would/should be the correct interpretation of:

[1,2] .* [[3,4], [5,6]]

My answer would be that there is no correct definition of the above, which is why it errors out (as it should). And then we’re back to the conclusion that in order to unequivocally define what you want to do, you have to mix broadcasting (where you have some control over what gets considered as a scalar) and “plain” elementwise operations (where all operands have to have the same shape) on multiple levels. I’m not sure it can be reasonably expected to find a simple syntax allowing to easily describe the combinatorics of possible interpretations when the number of levels increases.

2 Likes

Thank you for this extensive explanation of reasoning! I agree with the difficulties and ambiguities you mention, and plain arrays of plain arrays are probably not the most common thing anyway. However arrays of staticarrays are more common, and while a .+ b works fine for such cases, seemingly-same a ./ b does not. And even understanding the reasons of why it is made so does not help with significantly more verbose syntax and unintuitive failure modes. E.g. a ./ b executes successfully, it’s only somewhere later things go wrong: in case of a <: Array{SVector} when a function expecting a vectors gets a matrix, and in case of a <: Array{SArray{Tuple{3, 3}}} actually never! Nothing throws an error, only the end result differs.