Not a great example

My two cents about an example by @Elrod:

I just happened to see this example, but the original post has already been closed by @StefanKarpinski. Since many well-known members of this forum liked this particular example, I was a little confused, because the R code is not really a good one.

  1. No need to use sapply. Given a 1-row matrix b,
matrix(sapply(b, function(x) rep(x, 4)), nrow = 4)

is just the same as matrix(rep(b, each = 4), nrow = 4).

  2. IMO, the quoted example just shows that the operator ^ in base R is not vectorized here. I don’t know the exact reason, but one can easily override the default if he/she needs to do matrix ^ matrix element-wise.

Please don’t get me wrong: I don’t want to reopen the original post, nor compare broadcasting in Julia with R. I come from R but would like to spend time on learning Julia.

1 Like

Sure, but the argument being made was that broadcasting is generally useful and is extremely convenient in Julia through the dot-syntax. If you have to define a new function each time you want to operate elementwise, you basically have to double the number of functions you define.
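To illustrate (with a hypothetical function f, not from the original thread): in Julia, any function, including a user-defined one, gets an elementwise version for free via the dot-syntax, so no second "vectorized" definition is ever needed.

```julia
f(x) = x^2 + 1        # an ordinary scalar function

f(3)                  # scalar call: 10
f.([1, 2, 3])         # elementwise via dot-syntax: [2, 5, 10]
f.([1 2; 3 4]) .+ 1   # the dots fuse into a single loop over the matrix
```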

4 Likes

Sure - but if comparing to R and Python code, it’s pretty important to use code that is idiomatic and terse in that language.

1 Like

Your solution also requires the construction of an interim matrix (the matrix(rep(b, each = 4), nrow = 4)), which can be very costly when the dimensions are large, compared to the solution in Julia, where the ' just tells the broadcasting mechanism to use a different traversal.
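For contrast, here is a minimal sketch of the Julia side (assuming the 4×8 dimensions of the example being discussed): all the dotted operations fuse into a single traversal, so no interim matrix built from b is ever materialized.

```julia
A = rand(4, 8)
b = rand(8)
c = rand(4)

# b' is a lazy transpose; the whole dotted expression fuses into one loop
D = exp.(A .^ b') .- log.(c)
size(D)  # (4, 8)
```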

That said, direct comparisons with R are somewhat tricky because under the hood, all operations in R are on vectors, as R has no scalars (I don’t want to get into corner cases here):

> is.vector(1)
[1] TRUE

So for an R user, the difficult question is not how to get what is called broadcasting in Julia (everything is broadcasted all the time), but

  1. how to escape broadcasting and work with “atoms”, or broadcast over collections of non-scalars,

  2. how to do different kinds of broadcasting, eg

    julia> (1:4) .* (1:4)'
    4×4 Array{Int64,2}:
     1  2   3   4
     2  4   6   8
     3  6   9  12
     4  8  12  16
    

    (for this particular one, R has outer)

  3. how to make user-defined functions fast, i.e. avoid interim allocations
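On point 1, Julia’s escape hatch goes the other way around: nothing broadcasts unless you ask, and a container can be shielded from broadcasting by wrapping it in Ref (a small sketch, not from the original posts):

```julia
v = [2, 4, 6]

# Broadcast `in` over 1:6 while treating v as a single "atom":
# true exactly at 2, 4 and 6
in.(1:6, Ref(v))
```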

Also, please keep in mind that comparisons to R are not meant in an unkind way. Many people here worked a lot with R, and grew to like it with all its quirks.

The design of Julia benefited a lot from the lessons learned in R (among other languages), and not having to be compatible with earlier versions of itself (unlike R, with the enormous amount of code available for it) allowed a lot of opportunities to improve on the language design.

6 Likes

I don’t want my opinion to be an unkind one, and I’m really looking forward to learning from the Julia experts here.

3 Likes

R is probably the language I know second best, but my R is nowhere near as good as my Julia.

I appreciate your feedback, and agree that this looks better than what I had posted:

A <- matrix(runif(4*8), nrow = 4)
b <- t(runif(8))
c <- runif(4)
D <- exp(A ^ rep(b, each = 4)) - log(c)

where the last line was

D <- exp(A ^ matrix(sapply(b, function(x) rep(x, 4)), nrow = 4)) - log(c)

However, I still think it’s an advantage when the syntax can convey all the information, avoiding the need for additional function arguments.

1 Like

I just found that the matrix() is not necessary.

D <- exp(A ^ rep(b, each = 4)) - log(c)

I still think it’s a great example—we’ve had to have a whole thread on what the best way to write this example in R would be, whereas the Julia code is completely clear (once you know how broadcasting works) and hasn’t required any tweaks. The point is not to throw shade at R but to consider why: why does the way Julia handles this lead to less confusion about how to express this? Why doesn’t it require any guessing or weird extra keywords, and just “does the right thing”?

We didn’t get this nailed down immediately: Julia started out doing things exactly the same way as R, Matlab, and Python, with built-in vectorized versions of an essentially arbitrary set of functions. But this was never really satisfactory. How do you know if a function should be vectorized or not? A first-cut criterion is asking “Does it seem like someone might want to apply it to arrays?” But that doesn’t really help, since one wants to apply every function to an array sometimes.

Another criterion might be to exclude from vectorization only functions that have a conflicting potential meaning for certain kinds of arrays that we might want to allow. Multiplication of matrices with the right kind of dimensions is a case where one wants to do something other than elementwise, but there are many others. This approach would make the language really hard to use: it does elementwise operations almost everywhere, but not quite, and you just have to memorize the exceptions. Yuck. Also, if we were doing that, why do you have to explicitly define all those extra methods for the vectorized cases? Shouldn’t they be provided automatically? All in all, it’s just a mess. But that mess is the status quo in other systems. I can understand how they got there, having been there ourselves, but surely there’s a better way to do this.

Broadcasting is Julia’s attempt to provide a more principled and consistent approach. The example highlights the benefit of that consistency: it’s perfectly clear how to write this operation in Julia, while in other systems it runs into stumbling blocks and inconsistencies, requires weird keyword arguments, and takes a whole discussion thread to settle how to write it.

17 Likes

I just went through one of the very first packages I wrote in Julia, and I remember how silly it used to be with the vectorize_1arg (? I don’t even remember if that is the exact name), and then later writing a bunch of trivial list comprehension methods to cover “vectorizing” some function I wrote.

Dot broadcasting is one of those things that, once it was released, made so much sense that you wonder why it hadn’t been implemented earlier.

2 Likes

Yes, it was these awkward and unofficial but widely used @vectorize_1arg and @vectorize_2arg macros which generated vectorized methods for a function in terms of the scalar function. It was pretty bad. But that stuff is in there in all systems that have that kind of vectorization.

1 Like