When to use broadcasting with . vs map

If I want to apply a function across, say, a Vector in Julia, there are two ways I could do it:

map(f, seq)

f.(seq)

Is either way “better” than the other? In fact, is there any difference in the implementation between using map and broadcasting with . ?

4 Likes

On a user level, I prefer to use map when f is somehow complicated. That’s usually either when you need to write it as an anonymous function, or when you can’t use syntax sugar and have to call methods explicitly, or when the function is so complicated that you want to use do syntax.

Example 1:

f(x, y) = x + y
v = [1, 2, 3]

map(x -> f(x, 1), v)
# vs
(x -> f(x, 1)).(v)

Example 2

v = [(1, 1), (2, 2), (3, 3)]

map(x -> x[2], v)
# vs
getindex.(v, 2)

Example 3:

v = [(0, 1), (1, 2), (2, 3)]

map(v) do x
   println(x[1])
   x[2]*x[2]
end

In all other cases, broadcasting is usually easier to read and use.

3 Likes

map (or a comprehension) is probably faster if seq is a generic iterable collection rather than an array, since broadcasting first calls collect to convert iterables into arrays. For example:

julia> @btime sqrt.(i^2 for i in 1:100);
  325.522 ns (2 allocations: 1.75 KiB)

julia> @btime map(sqrt, i^2 for i in 1:100);
  235.057 ns (1 allocation: 896 bytes)

Notice that the memory allocation is doubled in the broadcast case, because of the extra array allocated by collect.
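For completeness, a comprehension also consumes the generator directly, without the intermediate collect:

```julia
# Comprehension: iterates the generator once, allocating only the result array.
v = [sqrt(i^2) for i in 1:100]
```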

25 Likes

In example-1, wouldn’t broadcasting look simpler this way:

julia> f.(v,1)
3-element Array{Int64,1}:
 2
 3
 4
3 Likes

Yeah, you are right, not the best illustration. What I was trying to say is that there can be situations where you have to write some sort of anonymous function. Maybe a better example is a hand-written implementation of the sign function:

v = [-10, 0, 10]
map(x -> x > 0 ? 1 : x < 0 ? -1 : 0, v)
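As an aside, Base already exports sign, so for real code you could just broadcast that; the hand-written anonymous function above is only for illustration:

```julia
v = [-10, 0, 10]
# Broadcasting the built-in sign gives the same result as the map above.
sign.(v) == map(x -> x > 0 ? 1 : x < 0 ? -1 : 0, v)  # true
```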
4 Likes

So broadcasting with . will actually convert seq to an array, then eventually create and return something of the type that seq originally was, while map directly creates a collection of the same type as the original seq?

I see what you mean here. I probably wouldn’t want to use that anonymous function by broadcasting either…

1 Like

If seq was already an array, no conversion is required. And broadcasting always produces an array, regardless of the type of seq.
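A quick way to see both points: broadcasting over a generator collects it first and returns an Array, whatever the input type was:

```julia
gen = (i^2 for i in 1:3)   # a Base.Generator, not an array
sqrt.(gen)                 # collected to an array, then broadcast:
                           # 3-element Vector{Float64}: [1.0, 2.0, 3.0]
```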

If there are multiple arguments, then the behavior of broadcasting and map are quite different:

Broadcasting does… broadcasting, and map does not:

julia> [1] .+ [1, 2, 3]
3-element Array{Int64,1}:
 2
 3
 4

julia> map(+, [1], [1, 2, 3])
1-element Array{Int64,1}:
 2

julia> [1, 1] .+ [1, 2, 3]
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 2 and 3")

julia> map(+, [1, 1], [1, 2, 3])
2-element Array{Int64,1}:
 2
 3

Also, broadcasting has some “symbolic” definitions for some types:

julia> (1:5) .+ (1:5)
2:2:10

julia> map(+, 1:5, 1:5)
5-element Array{Int64,1}:
  2
  4
  6
  8
 10

Unfortunately I don’t think there is yet a “standard” multi-argument “map” that requires the arguments to have the same indices. (I asked here.)
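A minimal sketch of such a strict map (a hypothetical helper, not anything in Base): check that all arguments share the same axes, then delegate to map:

```julia
# Hypothetical strict_map: like map, but errors on mismatched axes
# instead of silently truncating to the shortest argument.
function strict_map(f, xs...)
    ax = axes(first(xs))
    all(x -> axes(x) == ax, xs) ||
        throw(DimensionMismatch("strict_map arguments must have matching axes"))
    map(f, xs...)
end

strict_map(+, [1, 1], [1, 2])      # [2, 3]
# strict_map(+, [1, 1], [1, 2, 3]) # throws DimensionMismatch
```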

4 Likes

Interesting, I didn’t know that. It’d be nice if generators participated in broadcast more naturally, is there an open issue about this?

1 Like

Thanks for pointing this out. I would have expected map to require matching dimensions. I think having a “strict map” could also be a good thing to avoid unexpected behavior if you accidentally pass it arguments of different sizes, especially for users coming from / used to Python’s map behavior.

Additionally, map is preferable for mixed-type tuples: map will be type-stable, while broadcast won’t be.

3 Likes

I thought that would be the case, but look at this:

using Distributions
using BenchmarkTools
function random_dist()
    if rand() < 0.5
        return Normal(randn(), exp(randn()))
    else
        return Gamma(exp(randn()), exp(randn()))
    end
end
vdist = [random_dist() for ii in 1:1000]
tdist = Tuple(vdist)

julia> @btime map(mean, vdist);
  24.500 μs (1005 allocations: 23.62 KiB)

julia> @btime map(mean, tdist);
  755.500 μs (3492 allocations: 15.40 MiB)

julia> @btime broadcast(mean, vdist);
  38.100 μs (1498 allocations: 31.53 KiB)

julia> @btime broadcast(mean, tdist);
  738.000 μs (3493 allocations: 15.41 MiB)

julia> @btime mean.(vdist);
  37.900 μs (1500 allocations: 31.56 KiB)

julia> @btime mean.(tdist);
  2.644 μs (3 allocations: 39.25 KiB)

Not sure why mean.(tdist) is so fast, even faster than broadcast

Huh. This is weird. I thought mean.(tdist) and broadcast(mean, tdist) were exactly identical, and that the former was just syntactic sugar for the latter. Apparently not:

julia> @code_lowered mean.(tdist)
CodeInfo(
1 ─ %1 = Base.broadcasted(Main.mean, x1)
│   %2 = Base.materialize(%1)
└──      return %2
)

julia> @code_lowered broadcast(mean, tdist)
CodeInfo(
1 ─      nothing
│   %2 = Core.tuple(f, t)
│   %3 = Core._apply_iterate(Base.iterate, Base.Broadcast.map, %2, ts)
└──      return %3
)

For large tuples the performance difference is dramatic. This is surprising to me.

1 Like

Right? “Dots being syntactic sugar for broadcast” is exactly what I thought. Apparently not.

One reason for the . lowering is to enable broadcast fusion:

julia> @code_lowered exp.(log.(x))
CodeInfo(
1 ─ %1 = Base.broadcasted(Main.log, x1)
│   %2 = Base.broadcasted(Main.exp, %1)
│   %3 = Base.materialize(%2)
└──      return %3
)
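Concretely, fusion means the composed expression is executed as a single loop that allocates only the final result; there is no temporary array for the inner log.(x):

```julia
x = rand(10) .+ 1.0   # positive values, so log is defined
y = exp.(log.(x))     # one fused loop, one output allocation
y ≈ x                 # true (exp ∘ log is the identity on positive reals)
```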

As for why the dot broadcast is faster: it appears to hit a specialized tuple method in julia/broadcast.jl (v1.8.2) via materialize. This unrolls the tuple, which avoids a dynamic dispatch per element and allows for a fully type-stable (i.e. not boxed) return value.

In contrast, broadcast hits a method in julia/tuple.jl (v1.8.2). That function looks a lot like a generic fallback, and it performs like one too. Based on another branch in julia/tuple.jl, you would also get unrolling if length(tdist) < 32.

1 Like

These tuples are much too long for meaningful benchmarks. The rule of thumb is tuples are slower than vectors above a hundred elements.

I meant that if you have to apply a function to five different objects in a Tuple, use map, not broadcast:

julia> t = (1.0, 1, 1.0f0, Int16(2), Int32(0))
(1.0, 1, 1.0f0, 2, 0)

julia> @btime map(x -> 2x, t)
  17.130 ns (1 allocation: 48 bytes)
(2.0, 2, 2.0f0, 4, 0)

julia> @btime broadcast(x -> 2x, t)
  69.652 ns (2 allocations: 96 bytes)
(2.0, 2, 2.0f0, 4, 0)

But I think there are more optimisations for broadcast these days if you don’t use a global variable like this, so they’re about the same.

Note that broadcast is different from .:

julia> t = (1.0, 1, 1.0f0, Int16(2), Int32(0))
(1.0, 1, 1.0f0, 2, 0)

julia> @btime abs2.($t)
  366.188 ns (10 allocations: 352 bytes)
(1.0, 1, 1.0f0, 4, 0)

julia> @btime map(abs2, $t)
  1.458 ns (0 allocations: 0 bytes)
(1.0, 1, 1.0f0, 4, 0)

julia> @btime broadcast(abs2, $t)
  1.458 ns (0 allocations: 0 bytes)
(1.0, 1, 1.0f0, 4, 0)

julia> @btime Base.materialize(Base.broadcasted(abs2, $t))
  366.426 ns (10 allocations: 352 bytes)
(1.0, 1, 1.0f0, 4, 0)
1 Like

but ?broadcast says

A special syntax exists for broadcasting: f.(args…) is equivalent to broadcast(f, args…)

That’s bad. Looks like abs2.(t) is not fully inlined thus the unrolling is not working here.