Efficient Broadcast of Heterogeneous AbstractArrays?


I have a question about broadcasting on an AbstractArray. Easiest to explain by example. Lets say I have a type

type MyArray{T,T2}
A = MyArray((rand(5),rand(5)+im*rand(5)))

where I define an index as walking through A.x[1] and then through A.x[2]. If I wanted to inplace update some B of the same type, I would do something like

for i in eachindex(A.x[1])
  B.x[1][i] .+= A.x[1][i]
for i in eachindex(A.x[2])
  B.x[2][i] .+= A.x[2][i]

In this way of iterating, the heterogeneous array is iterated through type-stably.

Is there any way to make B .+= A do that kind of indexing?


Since B .+= A lowers to broadcast!(+, B, B, A), then by overriding broadcast! you can make it do whatever you want.

If you want it to handle lots of different combinations of MyArray, ordinary arrays, scalars, etcetera, it is possible but compilicated. The broadcast function is implemented in pure Julia, but is one of the most intricate functions in Base. One of the goals for future versions is to make it easier to re-purpose this machinery by overriding functions like Broadcast._broadcast_getindex etcetera.


But how do you handle broadcast fusion? Wouldn’t you have to make this for the anonymous functions that it generates?


What is the reason for putting the two vectors in a tuple, instead of two fields?
I would think that the compilation would be easier with two fields, but I’m not sure if the generated code would be any different (if the first subscript is always constant, as in these examples)


Don’t know how many vectors.


Yes you would, if I understood your concern correctly. But its easy to do that, you just define broadcast!(op, B, B, A) and then pass op along to the broadcast calls you would have in that function body that forward the broadcast to the underlying data.


But then I need to generate the function. There can be an arbitrary number of arrays in B.x, and the arguments on the other side can mix numbers and other MyArrays (which match size on each .x though). I am looking to broadcast things like this:

Sounds like a job for:

@generated Base.broadcast!(op, A::MyArray, B...)


I guess I can use it for for i in eachindex(A.x) do a broadcast! on A.x[i]


If you want to handle an arbitrary mixture and number of array/scalar arguments, but the arrays are all the same shape, you have three options:

  1. You could dig through base/broadcast.jl to see which functions you would need to override to re-use the broadcast machinery with a new type. (e.g. there is a _broadcast_getindex function you can override to use your own indexing mechanism, etcetera). This will hopefully get cleaned up and documented in the future.

  2. You can use a @generated function. In the case where the you have vectors that are all of the same length, this is actually pretty easy. I assigned implementing this as a homework problem in our performance-optimization class in January 2017, and you can see the solutions here (problem 2): https://github.com/stevengj/18S096-iap17/blob/master/pset2/pset2-solutions.ipynb

  3. Since the number of arguments N is a compile-time parameter, I think you can use ntuple tricks for this; I tried to do this in my solutions in January, but I ran into performance problems because I didn’t realize that I needed an @inline tag (https://github.com/JuliaLang/julia/issues/20196); in hindsight, I think this could be made to work. @generated seems easier, though, and is how Base.broadcast is implemented.


I think I got it:

using RecursiveArrayTools

A = (rand(5),rand(5))
p = ArrayPartition(A)
B = (rand(5),rand(5))
p2 = ArrayPartition(B)

add_idxs(x,expr) = expr
add_idxs{T<:ArrayPartition}(::Type{T},expr) = :($(expr).x[i])

@generated function Base.broadcast!(f,A::ArrayPartition,B...)
  @show B
  println(@which (add_idxs(B[1],:(B[1]))))
  exs = ((add_idxs(B[i],:(B[$i])) for i in eachindex(B))...)
  @show exs
  res = :(
  for i in eachindex(A.x)
  @show res

@generated function Base.broadcast(f,B::Union{Number,ArrayPartition}...)
  @show B
  println(@which (add_idxs(B[1],:(B[1]))))
  exs = ((add_idxs(B[i],:(B[$i])) for i in eachindex(B))...)
  @show exs
  res = :(
  for i in eachindex(A.x)
  @show res

p .= (*).(p,5)
julia> p

julia> p .= (*).(p,5)
B = (RecursiveArrayTools.ArrayPartition{Tuple{Array{Float64,1},Array{Float64,1}}},)
add_idxs{T<:RecursiveArrayTools.ArrayPartition{T}}(::Type{T}, expr) at REPL[20]:1
exs = (:(B[1].x[i]),)
res = :(for i = eachindex(A.x) # REPL[21], line 8:

julia> p

My only problem is that the Numbers are now showing up. Other ArrayPartitions do. But multiplying by a number seems to occur anyways. Is that being fused and just doesn’t show up here? I am confused.


Numeric literals are inlined into the fused expressions as an optimization, e.g. y .+= x .* 3 is lowered to broadcast!((y,x) -> y + x*3, y, y, x).

However, if you assign a scalar to a variable, then it isn’t inlined. e.g. if you set const α = 3 and do y .+= x .* α, it is lowered to broadcast!((y,x,a) -> y + x*a, y, y, x, α)


Aha! That explains why the tests looked funny but still worked. Thanks. I’m going to go do this to a bunch of my array types, and probably make a blog post about it. Thanks for helping me out!