`map` vs `broadcast`: should one prefer `map` if these are equivalent?

I’m referring to the simplest case where map and broadcast are equivalent: given a vector or tuple, I want to map a function over it to obtain another container. broadcast seems a natural choice here, and map seems better suited to generic iterables that don’t support indexing.
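To make the "equivalent" case concrete, here is a minimal sketch. The variable names are illustrative only and are not taken from any package.

```julia
# Hypothetical sample data; any function and container would do.
axes_tuple = (1:3, 1:4)
axes_vector = [1:3, 1:4]

# On a single tuple or vector, map and broadcasting return the same container.
@assert map(length, axes_tuple) == length.(axes_tuple) == (3, 4)
@assert map(length, axes_vector) == length.(axes_vector) == [3, 4]

# They stop being interchangeable once shapes differ: broadcasting expands
# the scalar 10 across the tuple, which map does not do.
@assert (1, 2, 3) .+ 10 == (11, 12, 13)
```

The question here only concerns the first two cases, where either spelling gives the same result.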

However, consider the performance in cases where the element types may not be inferred concretely. A simple example is computing the size along each dimension of an array, given its axes:

julia> using BenchmarkTools

julia> x = (1:3, 1:3); y = [x];

julia> @btime (x -> map(length, x[1]))($y);
  3.368 ns (0 allocations: 0 bytes)

julia> @btime (x -> length.(x[1]))($y);
  3.368 ns (0 allocations: 0 bytes)

So, in type-stable cases, these two are equivalent. However, here’s an instance where types aren’t inferred:

julia> x = (1:3, 1:3); y = Any[x];

julia> @btime (x -> map(length, x[1]))($y);
  23.183 ns (1 allocation: 32 bytes)

julia> @btime (x -> length.(x[1]))($y);
  478.985 ns (3 allocations: 128 bytes)

In such cases, map may be significantly faster. To me, this seems to suggest that one should choose map over broadcasting wherever these are equivalent.
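One way to check which situation you are in, without benchmarking, is to ask the compiler what it infers. This is a rough sketch: f_map and f_bc are hypothetical names for the two closures benchmarked above, and it assumes Julia 1.4 or later for only. It shows whether the return type is inferred concretely, not how fast either version runs.

```julia
x = (1:3, 1:3)
y_typed = [x]       # element type is the concrete tuple type
y_any = Any[x]      # element type is Any, so y_any[1] cannot be inferred

f_map(v) = map(length, v[1])
f_bc(v) = length.(v[1])

# Both forms return the same value either way.
@assert f_map(y_any) == f_bc(y_any) == (3, 3)

# Base.return_types reports what the compiler infers for given argument types.
inferred_typed = only(Base.return_types(f_map, (typeof(y_typed),)))
inferred_any = only(Base.return_types(f_map, (Vector{Any},)))

@assert isconcretetype(inferred_typed)   # concrete when the element type is known
@assert !isconcretetype(inferred_any)    # not concrete for a Vector{Any}
```

In the second case the call is dispatched at run time, which is where the cost difference between map and broadcast shows up in the timings above.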

For anyone wondering if there are cases where such an operation wouldn’t be concretely inferred: yes, there are. ApproxFun is littered with such cases, since it deals with both finite and infinite matrices, often in a type-unstable manner.

I’ve submitted one such PR to FillArrays.jl recently. I wonder if this is an intrinsic limitation because broadcast is a more complicated operation, in which case developers should be encouraged to prefer map if that’s an option?

Earlier thread on the same topic:

> I wonder if this is an intrinsic limitation because broadcast is a more complicated operation, in which case developers should be encouraged to prefer map if that’s an option?

That’s the way I understand it, at least.

The Julia Manual even says that the dot syntax isn’t required for performance, only that it’s convenient:
https://docs.julialang.org/en/v1/manual/functions/#man-vectorized

> In Julia, vectorized functions are not required for performance, and indeed it is often beneficial to write your own loops (see Performance Tips), but they can still be convenient.

Also to note: there are subtle performance considerations to keep in mind when using broadcasting:
https://docs.julialang.org/en/v1/manual/performance-tips/#More-dots:-Fuse-vectorized-operations
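As a small illustration of what those two manual sections describe, here is a sketch comparing an unfused vectorized expression, a fused dotted one, and a hand-written loop. The data and variable names are made up for the example.

```julia
x = collect(1.0:5.0)

# Unfused: each vectorized call allocates its own temporary array.
unfused = 2 * x + ones(length(x))

# Fused: the dotted expression becomes a single loop, and .= writes the
# result into a preallocated output.
fused = similar(x)
fused .= 2 .* x .+ 1

# The explicit loop that the manual says is just as acceptable.
looped = similar(x)
for i in eachindex(x)
    looped[i] = 2 * x[i] + 1
end

@assert unfused == fused == looped
```

All three give the same values; they differ only in how many temporaries are allocated along the way.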

I think broadcast on tuples just takes an exceptionally bad code path and should be avoided wherever possible (or really, fixed in Base). In general though, my experience is that sometimes broadcast infers better and other times map does.

I have replaced many broadcasts with maps in the past because, for some reason, the broadcast didn’t infer. Sometimes I also had to split up expressions: with too many arithmetic operators in one expression it didn’t infer, but it did when written in two parts. Those were usually static arrays, though, so it might be the same thing that @ToucheSir calls an “exceptionally bad code path” for tuples.
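For reference, the two rewrites described here look roughly like this on plain tuples. This is only a sketch of the transformation with made-up values; whether either form infers better than the single broadcast depends on the Julia version and the types involved, so it makes no performance claim.

```julia
a = (1.0, 2.0, 3.0)
b = (4.0, 5.0, 6.0)
c = (0.5, 0.5, 0.5)

# One fused broadcast expression.
fused = a .* b .+ c

# The same computation as a single map over equal-length tuples.
mapped = map((ai, bi, ci) -> ai * bi + ci, a, b, c)

# The same computation split into two smaller broadcasts.
prod_ab = a .* b
two_step = prod_ab .+ c

@assert fused == mapped == two_step == (4.5, 10.5, 18.5)
```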