Not quite so simple; on Julia 1.5 and 1.6 I get the same time from both of those, and both convert to Gray{Float32} as they go. To prevent mean from doing something sneaky, this may be better:
julia> function naivesum(A)
           s = zero(eltype(A))
           @inbounds @simd for a in A
               s += a
           end
           return s
       end
naivesum (generic function with 1 method)
julia> @btime naivesum($v1)
3.035 μs (0 allocations: 0 bytes)
Gray{N0f8}(0.125)
julia> @btime naivesum($v2)
16.653 μs (0 allocations: 0 bytes)
Gray{Float32}(131201.62f0)
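As you can see, there’s actually a cost to promoting to Float32: roughly 5x in this benchmark. Of course, some kind of promotion is necessary to get the right answer here, since the N0f8 accumulator overflows and returns 0.125 instead of a value near 1.3e5.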
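To illustrate why the narrow accumulator gives a wrong answer, here is a minimal sketch using plain integers instead of the fixed-point types. It relies on the fact that N0f8 is backed by a UInt8 and N24f8 by a UInt32; the name naivesum_raw and the test data are made up for illustration, and this only mimics the raw storage, not the actual Gray arithmetic.

```julia
# Hypothetical helper: sum raw integer values using an accumulator of type T.
function naivesum_raw(::Type{T}, A) where T
    s = zero(T)
    for a in A
        # `a % T` converts without an overflow check; `+` on T wraps modulo 2^bits
        s += a % T
    end
    return s
end

# Made-up data: 1000 samples with raw value 0x80 (about 0.5 when read as N0f8)
raw = fill(0x80, 1000)

narrow = naivesum_raw(UInt8, raw)   # 8-bit accumulator wraps around
wide   = naivesum_raw(UInt32, raw)  # 32-bit accumulator holds the full sum

println("UInt8 accumulator:  ", Int(narrow))
println("UInt32 accumulator: ", Int(wide))
```

The true raw sum is 1000 * 128 = 128000, which fits easily in 32 bits but is reduced modulo 256 in the 8-bit accumulator. Widening the integer storage keeps the loop in integer arithmetic, which is the idea behind accumulating in a wider fixed-point type rather than converting each element to Float32.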
EDIT: you can compare various strategies with
julia> function naivesum(::Type{T}, A) where T
           s = zero(T)
           @inbounds @simd for a in A
               s += a
           end
           return s
       end
naivesum (generic function with 2 methods)
julia> @btime naivesum(Gray{N24f8}, $v1)
8.226 μs (0 allocations: 0 bytes)
Gray{N24f8}(1.30728e5)
julia> @btime naivesum(Gray{Float32}, $v1)
13.343 μs (0 allocations: 0 bytes)
Gray{Float32}(130727.81f0)
which is one of the reasons I’ve wondered about promoting to a wider FPN (FixedPointNumbers) type.