Is broadcast! supposed to work on StaticArrays?

aaowens · March 26, 2018, 12:09am

julia> sv = SVector{3}(ones(3))
3-element StaticArrays.SArray{Tuple{3},Float64,1,3}:
 1.0
 1.0
 1.0

julia> sv .= sv .+ sv
ERROR: setindex!(::StaticArrays.SArray{Tuple{3},Float64,1,3}, value, ::Int) is not defined.
Stacktrace:
 [1] setindex!(::StaticArrays.SArray{Tuple{3},Float64,1,3}, ::Float64, ::Int64) at /home/andrew/.julia/v0.6/StaticArrays/src/indexing.jl:3
 [2] macro expansion at /home/andrew/.julia/v0.6/StaticArrays/src/broadcast.jl:215 [inlined]
 [3] _broadcast!(::##3#4, ::StaticArrays.Size{(3,)}, ::StaticArrays.SArray{Tuple{3},Float64,1,3}, ::Tuple{StaticArrays.Size{(3,)}}, ::StaticArrays.SArray{Tuple{3},Float64,1,3}) at /home/andrew/.julia/v0.6/StaticArrays/src/broadcast.jl:170
 [4] broadcast!(::Function, ::StaticArrays.SArray{Tuple{3},Float64,1,3}, ::StaticArrays.SArray{Tuple{3},Float64,1,3}) at ./broadcast.jl:206

If this worked, .= would work as expected regardless of if sv is an Array or StaticArray.

ChrisRackauckas · March 26, 2018, 12:22am

StaticArrays are not mutable, so this cannot work on them since .= lowers to broadcast! which mutates the values instead of returning them. Besides, SArrays shouldn’t be broadcasted anyways. You’ll notice it’s faster to let them utilize their unrolled fast arithmetic rather than using the anonymous function of broadcast. You need to code for these two cases very differently.

vchuravy · March 26, 2018, 12:54am

I disagree. There is no reason why you shouldn’t use broadcast and SArrays. The unrolled cases come into play for linear algebra operations, but there is no relation with elementwise operations . If broadcast is indeed slow for SArrays we need to fix it (and there is no apriori reason why it shouldn’t be fast, since you could unroll the entire broadcast operation).

The question whether or not .= should work for SArrays is debatable. Since the SArray is immutable asking for it to be overwritten makes little sense, but I would be in favour of making .= mean = for SArrays (and other immutable containers). Being able to use broadcast in generic code is important.

Right now the lowering of .= prevents SArray from defining that.

ChrisRackauckas · March 26, 2018, 1:01am

Sure, there’s no a priori reason why it shouldn’t be fast, but right now it isn’t so it should be avoided. If that gets changed then that would help a lot. I wasn’t talking about a hypothetical future, I was talking about writing code right now (this has a Usage tag and not a Development tag. I agree that the noted issues maybe could be solved in the future, but I don’t see a reason to believe it will change soon at all).

aaowens · March 26, 2018, 1:08am

Ok. This came up because I was trying to use LineSearches.jl with an SVector. It’s typed for AbstractArray but fails at a .= line.

tkoolen · March 26, 2018, 1:48am

StaticArrays’s broadcast and broadcast! are fully unrolled (as in, the generated function creates an expression for each element of the destination without any runtime for-loops). +, -, etc. are all defined in terms of broadcast and map, e.g.

@inline +(a::Number, b::StaticArray) = broadcast(+, a, b)

broadcast! to MArray is also specialized, and with Fix #385, broadcast! with non-static LHS broken on Julia master by tkoolen · Pull Request #387 · JuliaArrays/StaticArrays.jl · GitHub, broadcast!ing statically sized arguments to an Array will also use that machinery instead of the generic broadcast! in Base.

ChrisRackauckas · March 26, 2018, 2:00am

That’s not what shows up empirically. I remember we did a big change because of it: Don't broadcast out-of-place · Issue #235 · SciML/OrdinaryDiffEq.jl · GitHub. Though there may have been something other Base broadcasting bug involved like Performance issues with fusing 12+ broadcasts · Issue #22255 · JuliaLang/julia · GitHub. Just a quick example:

function f(x,y,z)
    x .+ y .+ z
end
function g(x,y,z)
    x + y + z
end
using StaticArrays, BenchmarkTools
x = @SVector [2.0,3.0,4.0]
y = @SVector [2.0,3.0,4.0]
z = @SVector [2.0,3.0,4.0]

@btime f(x,y,z) # 33.372 μs (21 allocations: 768 bytes)
@btime g(x,y,z) # 19.085 ns (1 allocation: 32 bytes)

@btime f($x,$y,$z) # 3.220 ns (0 allocations: 0 bytes)
@btime g($x,$y,$z) # 3.220 ns (0 allocations: 0 bytes)

So I don’t know, but removing broadcast was a big improvement in the larger package for some reason . (But we can discuss this more in a separate thread)

tkoolen · March 26, 2018, 2:21am

Not sure if I’m a fan of that. I think it could be pretty confusing. Also, to support both mutable and non-mutable arrays, you’d have to do something like

function f(x::AbstractArray)
    y = similar(x) # needed for Array, what should `similar` do for StaticArray?
    y .= x.^2
    y
end

similar(::StaticArray) could just call zeros, but I’d be worried that the compiler wouldn’t be able to optimize this away, especially if the broadcast!ed function is not inlined. So you’re going to have suboptimal performance for StaticArrays, and I think quite a few StaticArrays users want to squeeze every bit of performance out of the algorithm.

More or less related: Fix #385, broadcast! with non-static LHS broken on Julia master by tkoolen · Pull Request #387 · JuliaArrays/StaticArrays.jl · GitHub.

tkoolen · March 26, 2018, 2:43am

The results for non-interpolated x, y, and z can probably be explained by having to perform runtime dispatch for every element (with f) vs. only once (with g). So I can see how it can make a sizeable difference in type-unstable code.

If x, y, and z ~~all have the same dimensions~~ are all StaticArrays, + will actually call map (which is also specialized and fully unrolled in StaticArrays). So another difference could be that there’s no “loop fusion” (quotes because there is no actual loop) with x + y + z, while x .+ y .+ z is fused. Maybe that results in operations being performed in a different order, which could have some influence. Some of those differences might go away as the compiler performs optimization, but I guess there could be differences.

Edit: also, the fact that several unfused maps are called vs. one fused broadcast could certainly have been a big difference in light of inference limits, as in your https://github.com/JuliaLang/julia/issues/22255.

Topic		Replies	Views
Slowdown with StaticArrays and broadcasting General Usage broadcast , array	5	706	October 17, 2018
Broadcasting with vectors of static vectors General Usage broadcast	11	2779	July 30, 2018
Custom broadcasting for static, immutable type General Usage question , broadcast , struct , staticarrays	1	550	August 2, 2022
Static array over 1,024 elements is slow to compile? General Usage	11	3129	March 25, 2021
Copyto! issue when updating fields of struct General Usage question , performance , mutable-structure , staticarrays	8	503	July 31, 2022

Is broadcast! supposed to work on StaticArrays?

Related topics