You should not benchmark in global scope. Wrap the code in functions and interpolate the array into the benchmark with $A:
using BenchmarkTools
N = 10000000
A = randn(N)
sum1(A) = sum(A)
sum2(A) = sum(A[i] for i in 1:N)
sum3(A) = sum(a for a in A)
Then benchmark each one:
julia> @benchmark sum1($A)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.576 ms (0.00% GC)
median time: 5.744 ms (0.00% GC)
mean time: 5.672 ms (0.00% GC)
maximum time: 6.901 ms (0.00% GC)
--------------
samples: 881
evals/sample: 1
julia> @benchmark sum2($A)
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 5
--------------
minimum time: 12.764 ms (0.00% GC)
median time: 12.885 ms (0.00% GC)
mean time: 12.972 ms (0.00% GC)
maximum time: 16.242 ms (0.00% GC)
--------------
samples: 386
evals/sample: 1
julia> @benchmark sum3($A)
BenchmarkTools.Trial:
memory estimate: 48 bytes
allocs estimate: 3
--------------
minimum time: 12.761 ms (0.00% GC)
median time: 12.874 ms (0.00% GC)
mean time: 12.941 ms (0.00% GC)
maximum time: 16.360 ms (0.00% GC)
--------------
samples: 387
evals/sample: 1
So the difference is 2-3x. The above was run on Julia v0.6-rc1.
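To see what the $ interpolation changes, here is a minimal sketch that times the same call with and without it. The name sum2_local is hypothetical (it is not part of the code above); it is a variant of sum2 that iterates over eachindex(A) instead of reading the non-const global N. A smaller array is used only to keep the run short. No timings are quoted because they depend on the machine and Julia version.

```julia
using BenchmarkTools

A = randn(10_000)   # non-const global, like A in the setup above

# Hypothetical variant of sum2 that does not depend on the global N
sum2_local(A) = sum(A[i] for i in eachindex(A))

# Without `$`, A is looked up as an untyped global on every evaluation
t_global = @belapsed sum2_local(A)

# With `$`, the value of A is interpolated into the benchmark expression,
# so the call is timed as if A were a local variable
t_interp = @belapsed sum2_local($A)

println("global: ", t_global, " s, interpolated: ", t_interp, " s")
```

For a call that takes milliseconds the two numbers will probably be close, since the global lookup costs little next to the summation itself. The interpolation matters more for very fast calls, where that overhead can dominate the measurement.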