The key is that you are not really broadcasting over anything, so forcing this into the broadcasting framework is not idiomatic. You are generating a vector elementwise.
The problem with that approach is that it’s broadcasting a generator, which ends up materializing the generator first, hence the double allocations. If you’re intent on using broadcasting, you can do better by avoiding the generator as in this example:
exp.(1:10^6 .|> _ -> rand())
As for performance, both versions benchmark about the same under @btime, but only because @btime reports the minimum time over many invocations. Versions that allocate more memory have a higher average runtime due to worse cache efficiency and garbage collection. See below; first @Tamas_Papp’s version:
julia> @benchmark [exp(rand()) for _ in 1:10^6]
memory estimate: 7.63 MiB
--------------
minimum time: 11.669 ms (0.00% GC)
median time: 12.656 ms (0.00% GC)
mean time: 13.682 ms (7.57% GC)
maximum time: 18.180 ms (17.96% GC)
--------------
samples: 366
Next my broadcasted version:
julia> @benchmark exp.(1:10^6 .|> _ -> rand())
memory estimate: 7.63 MiB
--------------
minimum time: 11.263 ms (0.00% GC)
median time: 12.247 ms (0.00% GC)
mean time: 13.254 ms (7.66% GC)
maximum time: 17.265 ms (18.91% GC)
--------------
samples: 378
Finally, the broadcasted generator (double the allocations). Note how the minimum time is about the same, but the average is quite a bit higher and more time is spent in GC:
julia> @benchmark exp.(rand() for i=1:10^6)
memory estimate: 15.26 MiB
--------------
minimum time: 12.164 ms (0.00% GC)
median time: 17.054 ms (15.13% GC)
mean time: 16.380 ms (13.03% GC)
maximum time: 20.556 ms (19.80% GC)
--------------
samples: 305
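A quick way to see the double allocation directly, without a full benchmark, is `@allocated`. This is a rough sketch (exact byte counts vary a little across Julia versions), with helper names of my own choosing:

```julia
# Wrap each variant in a function so @allocated measures only the call itself.
gen_bcast(n) = exp.(rand() for i in 1:n)      # generator is collected, then exp.() allocates again
comprehension(n) = [exp(rand()) for _ in 1:n] # single output array

# Warm up to exclude compilation allocations.
gen_bcast(10); comprehension(10)

n = 10^6
a_gen  = @allocated gen_bcast(n)       # roughly two 8 MB arrays
a_comp = @allocated comprehension(n)   # roughly one 8 MB array
println(a_gen / a_comp)                # close to 2
```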
Looking at the timings, I see that the two once-allocating versions suggested in this thread are actually slower than the twice-allocating version, and that the explicit map!() is the fastest:
julia> @benchmark exp.(rand(10^6))
BenchmarkTools.Trial:
memory estimate: 15.26 MiB
allocs estimate: 4
--------------
minimum time: 9.125 ms (0.00% GC)
median time: 9.517 ms (2.78% GC)
mean time: 9.590 ms (3.56% GC)
maximum time: 50.124 ms (80.48% GC)
--------------
samples: 522
evals/sample: 1
julia> @benchmark [exp(rand()) for _ in 1:10^6]
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 10.398 ms (0.00% GC)
median time: 10.463 ms (0.00% GC)
mean time: 10.710 ms (2.36% GC)
maximum time: 51.215 ms (78.24% GC)
--------------
samples: 467
evals/sample: 1
julia> @benchmark exp.(1:10^6 .|> _ -> rand())
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 10.348 ms (0.00% GC)
median time: 10.412 ms (0.00% GC)
mean time: 10.671 ms (2.41% GC)
maximum time: 51.442 ms (78.33% GC)
--------------
samples: 469
evals/sample: 1
julia> @benchmark (x = rand(10^6); map!(exp, x, x))
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 8.997 ms (0.00% GC)
median time: 9.059 ms (0.00% GC)
mean time: 9.350 ms (3.04% GC)
maximum time: 49.337 ms (80.68% GC)
--------------
samples: 535
evals/sample: 1
I guess it’s because the vectorized rand(n) is faster than calling rand() in a loop:
julia> @benchmark [rand() for _ in 1:10^6]
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 1.567 ms (0.00% GC)
median time: 1.581 ms (0.00% GC)
mean time: 1.673 ms (5.33% GC)
maximum time: 42.036 ms (94.77% GC)
--------------
samples: 2987
evals/sample: 1
julia> @benchmark rand(10^6)
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 716.757 μs (0.00% GC)
median time: 731.317 μs (0.00% GC)
mean time: 805.241 μs (8.98% GC)
maximum time: 41.032 ms (96.89% GC)
--------------
samples: 6201
evals/sample: 1
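Taking that one step further: since the vectorized rand is the fast path, a buffer can be reused across calls by combining `Random.rand!` with an in-place fused broadcast. A minimal sketch (the `exp_rand!` name is my own, not from the thread):

```julia
using Random

# Fill a preallocated buffer with uniform randoms, then exponentiate in place.
# After the first call, this allocates nothing new.
function exp_rand!(x::AbstractVector{Float64})
    rand!(x)       # vectorized RNG fill into the existing buffer
    x .= exp.(x)   # fused in-place broadcast, no temporary array
    return x
end

x = Vector{Float64}(undef, 10^6)
exp_rand!(x)   # every element is now in [1, e)
```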
Yes. Generators are not broadcastable, so they will be materialized during broadcasting:
julia> a = Base.broadcasted(_ -> rand(), 1:3) # this is 1:3 .|> _ -> rand()
Base.Broadcast.Broadcasted(getfield(Main, Symbol("##271#272"))(), (1:3,))
julia> Broadcast.broadcastable(a) # same object, no materialization
Base.Broadcast.Broadcasted(getfield(Main, Symbol("##271#272"))(), (1:3,))
julia> b = (rand() for i=1:3)
Base.Generator{UnitRange{Int64},getfield(Main, Symbol("##273#274"))}(getfield(Main, Symbol("##273#274"))(), 1:3)
julia> Broadcast.broadcastable(b) # materialization happens here
3-element Array{Float64,1}:
0.9143249336534409
0.733436533928558
0.9617715309192958
As for why it’s done this way, perhaps someone more familiar with the internals can clarify, but I believe it has to do with generators not being indexable.
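My understanding, sketched against the public API only: broadcasting fills the output via `getindex`, which a lazy `Broadcasted` object supports but a `Generator` does not, so the generator has to be collected into an array first:

```julia
using Base.Broadcast: broadcasted, instantiate

b  = (rand() for i in 1:3)
bc = instantiate(broadcasted(_ -> rand(), 1:3))

applicable(getindex, b, 1)   # false: generators can only be iterated
applicable(getindex, bc, 1)  # true: Broadcasted is lazily indexable
bc[2]                        # computes a single element on demand
```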