Why are there broadcast versions of functions?

question

#1

i was curious as to why the ‘.’ versions, abs., log10., etc… have been created ?

doesn’t the fact that the functions are methods and can specialize on their argument make this unnecessary, i.e. what was wrong with the non ‘.’ versions ?


#2

I think you misunderstand. There is no “broadcast version”, there is only broadcast. f.(...) is lowered to broadcast(f, ...). Please read the manual.


#3

The . is now just a shorthand for broadcast / broadcast!. If you create a function foo(x::Int) = 3 * x, then foo.([1, 2, 3]) will automatically work and return

3-element Array{Int64,1}:
 3
 6
 9

Running @which foo.([1, 2, 3]) will print

broadcast(f, A, Bs...) in Base.Broadcast at broadcast.jl:434

showing that the broadcast function is being called. So it’s not that abs., log10., etc. have been explicitly created, it’s just that the dot syntax is a language feature that applies to all functions.


#4

Obligatory read:

https://julialang.org/blog/2017/01/moredots

Essentially by making it a part of the syntax, many optimizations can occur to get rid of temporary variables and speed it up (without runtime magic).


#5

It also eases the load on the developer. Instead of having to create a special method of their function for collections (and perhaps even special versions for different collections), they can just rely on broadcast to handle it for them, in a uniform and well-defined way.

The user also doesn’t have to wonder whether a collection version exists, (because sometimes it does and sometimes the developer just didn’t think of it). Just use dot broadcast.

Some functions should actually have separate different behaviour for collections. Let’s say you develop a method foo(x, y) that essentially multiplies x and y. If someone inputs two 2-dimensional arrays instead of two numbers, how should your code handle that? Perform elementwise multiplication or matrix multiplication? You need to decide in your code, and let’s say you decide to use matrix multiplication.

But what if the caller actually wants the elementwise operation? Without broadcasting, they’ll have to write a loop. With broadcasting, the developer does not need to think about the elementwise behaviour. The caller just writes foo.(X, Y), and automatically get what they want.


#6

There’s also the eternal question of “should this function be a broadcast function or not?” There’s no good, consistent answer, just a random division of functions that are and functions that aren’t. The f.(x, y) calling scheme allows the caller to choose, which is the correct place for the choice to be made.


#7

No, because you may want to define a method for arrays that isn’t simply the “broadcasted” version of the “scalar” method.

For example, currently expm is the function to compute the matrix exponential, which is a substantially different operation than the “broadcasted” version of the scalar method. There is an ongoing debate whether rename it to exp (and I’m personally in favor of this change), so that

  • exp(x), with x scalar, gives the exponential of the number x;
  • exp(A), with A matrix, gives the matrix exponential of A;
  • exp.(A) gives the element-wise exponential of the matrix A.

Thus, using a different syntax for broadcast than the simple function call and leaving the freedom to define different methods for arrays that aren’t the broadcasted version of the scalar method is a good thing.


#8

aha! exactly what I was looking for. thanks :slight_smile:


#9

For a different reason, I find myself asking the same question for this situation: suppose that a long precomputation is needed:

function f(x::Number, y::Number, z)
    c = long_precomputation_of_parameters(z)
    compute_something.(x, y, c)
end

For this situation, short of having to awkwardly call compute_something.(x, y, c), it makes more sense in my opinion to have only the version taking vectors as input.


#10

I’m confused about what the issue is here – is it an objection to the syntax?


#11

I am not sure you are replying to my post, and I realize that the post was hastily written.

No, the syntax is not the problem. I am instead struggling to find the correct signature of a function that requires a long set up.

I apologize in advance because the specific case I am going to describe is slightly different from the one outlined above: I need to evaluate series of fully normalized Legendre polynomials which is usually done by the Clenshaw algorithm: the coefficients \alpha_k and \beta_k in the article contains square roots.

If I use the dot notation, I need to compute the coefficients for each input element; on the other hand, using the vectorized version, the coefficients are computed once for input vector. (If you are going to suggest to cache the coefficients, this is what I do when I use Float64 only.)

What I wanted to say is that there is another class of problems for which the vectorized version may be more efficient than the dot version.

Does all this make sense?


#12

Normally, the Julian solution for this sort of thing is to construct an object that encapsulates your precomputation. For example, you have a constructor Legendre(data) that creates a Legendre object that stores all of the coefficients etcetera that you need to evaluate the interpolant efficiently. An L = Legendre(data) object would be callable, so you could then do L(x) to evaluate the interpolant at x, and L.(x) to evaluate it efficiently for an array. Or, more compactly, Legendre(data).(x) would efficiently construct the coefficients (once) and then apply them to an array, without requiring you to ever define a specialized “vectorized” method.

For real-world examples of this, see the Fun object in ApproxFun.jl or the various InterpolationType objects in Interpolations.jl.

Then, for example y .+= sqrt.(Legendre(data).(x .- 1) .* 2) would fuse into a single in-place assignment loop that constructed L = Legendre(data) once and computed y[i] += sqrt(L(x[i] - 1) * 2) in-place for each element of two arrays x and y.


#13

Sorry for reviving an old thread, but something linked here which made me read this and got me curious

I am confused. Is this something introduced in 0.7? I looked at the result of expand for your example expression and got suspicious by its output, so i tried the following mock imlementation:

julia> struct Legendre
           data
           Legendre(data) = (println("foo"); new(data))
       end

julia> (L::Legendre)(x) = L.data * x

julia> x = rand(5) .+ 10; y = zeros(5);

julia> y .+= sqrt.(Legendre(2).(x .- 1) .* 2)
foo
foo
foo
foo
foo
5-element Array{Float64,1}:
 6.3153 
 6.03635
 6.32346
 6.2137 
 6.04248

EDIT: the above is 0.6 but I just tried a couple days old 0.7 branch on my second machine and it gives the same output. Am I doing something subtle that prevents the struct to be created just once? I also tried using a function barrier as well as adding a ::Int to the data field (just in case). both without success.


#14

A little more compact example of the above issue:

struct Bar
    data
    Bar(data) = (println("foo"); new(data))
end
(B::Bar)(x) = B.data * x
Bar(2).([1,2,3]) ##Bar(2) will be only evaluated once
broadcast(sqrt, Bar(2).([1,2,3])) ##Bar(2) will also be evaluated once
sqrt.(Bar(2).([1,2,3])) ##Bar(2) will be evaluated three times

#15

No, my mistake. Legendre(2).(x) corresponds to broadcast(Legendre(2), x), which calls Legendre(2) once, but sqrt.(Legendre(2).(x)) turns into broadcast(x -> sqrt(Legendre(2)(x)), x), which calls Legendre(2) multiple times. You need to do L = Legendre(2) separately, then do sqrt.(L.(x)), to ensure that the constructor is called only once.

It would be possible, in principle, to change this so that sqrt.(expression.(x)) turns into let tmp = expression; broadcast(x -> sqrt(tmp(x)), x); end. I haven’t really thought through what implications this would have, though. In any case, it would be (in principle) a breaking change that is probably too late to get into Julia 1.0. Or maybe this could be viewed as a bugfix?


#16

I always thought of nested broadcasts as calculating non-broadcasted values once (not sure why though, it just seemed like good design so I assumed it), so I would think this is a bug. It would be nice to have this as a property one could rely on.


#17

File an issue on GitHub if that’s a problem for you. Since it’s about a very specific case, it could be considered a bugfix and applied even after the feature freeze (but before 1.0).


#18

Opened an issue:

@Non-Contradiction: I used your nice compact example, but could not give credit on github since I don’t know your username there, just PM me if you want me to add it.