I have also posted this question on GitHub as "Are the examples in "Closures should be avoided whenever possible" still valid in Julia v1.9+?" (Issue #34 · SciML/SciMLStyle).
I just tested the examples in the section “Closures should be avoided whenever possible”, and the suggested approach (using Base.Fix2) no longer appears to be faster than using closures. For example:
julia> vector_of_vectors = [rand(4) for _ in 1:5];
julia> @code_warntype map(Base.Fix2(getindex, 2), vector_of_vectors)
MethodInstance for map(::Base.Fix2{typeof(getindex), Int64}, ::Vector{Vector{Float64}})
from map(f, A::AbstractArray) @ Base abstractarray.jl:3255
Arguments
#self#::Core.Const(map)
f::Base.Fix2{typeof(getindex), Int64}
A::Vector{Vector{Float64}}
Body::Vector{Float64}
1 ─ %1 = Base.Generator(f, A)::Base.Generator{Vector{Vector{Float64}}, Base.Fix2{typeof(getindex), Int64}}
│ %2 = Base.collect_similar(A, %1)::Vector{Float64}
└── return %2
julia> @code_warntype map(v -> v[2], vector_of_vectors)
MethodInstance for map(::var"#25#26", ::Vector{Vector{Float64}})
from map(f, A::AbstractArray) @ Base abstractarray.jl:3255
Arguments
#self#::Core.Const(map)
f::Core.Const(var"#25#26"())
A::Vector{Vector{Float64}}
Body::Vector{Float64}
1 ─ %1 = Base.Generator(f, A)::Base.Generator{Vector{Vector{Float64}}, var"#25#26"}
│ %2 = Base.collect_similar(A, %1)::Vector{Float64}
└── return %2
julia> @code_warntype Base.vect(v[2] for v in vector_of_vectors)
MethodInstance for Base.vect(::Base.Generator{Vector{Vector{Float64}}, var"#27#28"})
from vect(X::T...) where T @ Base array.jl:126
Static Parameters
T = Base.Generator{Vector{Vector{Float64}}, var"#27#28"}
Arguments
#self#::Core.Const(Base.vect)
X::Tuple{Base.Generator{Vector{Vector{Float64}}, var"#27#28"}}
Locals
@_3::Union{Nothing, Tuple{Int64, Int64}}
@_4::Int64
i::Int64
Body::Vector{Base.Generator{Vector{Vector{Float64}}, var"#27#28"}}
1 ─ %1 = Base.length(X)::Core.Const(1)
│ %2 = (1:%1)::Core.Const(1:1)
│ %3 = Base.IteratorSize(%2)::Core.Const(Base.HasShape{1}())
│ %4 = (%3 isa Base.SizeUnknown)::Core.Const(false)
│ %5 = Base._array_for($(Expr(:static_parameter, 1)), %2, %3)::Vector{Base.Generator{Vector{Vector{Float64}}, var"#27#28"}}
│ %6 = Base.LinearIndices(%5)::LinearIndices{1, Tuple{Base.OneTo{Int64}}}
│ (@_4 = Base.first(%6))
│ (@_3 = Base.iterate(%2))
│ %9 = (@_3::Core.Const((1, 1)) === nothing)::Core.Const(false)
│ %10 = Base.not_int(%9)::Core.Const(true)
└── goto #6 if not %10
2 ─ %12 = @_3::Core.Const((1, 1))
│ (i = Core.getfield(%12, 1))
│ %14 = Core.getfield(%12, 2)::Core.Const(1)
│ %15 = Base.getindex(X, i::Core.Const(1))::Base.Generator{Vector{Vector{Float64}}, var"#27#28"}
│ nothing
└── goto #4 if not %4
3 ─ Core.Const(:(Base.push!(%5, %15)))
└── Core.Const(:(goto %21))
4 ┄ Base.setindex!(%5, %15, @_4::Core.Const(1))
│ nothing
│ (@_4 = Base.add_int(@_4::Core.Const(1), 1))
│ (@_3 = Base.iterate(%2, %14))
│ %24 = (@_3::Core.Const(nothing) === nothing)::Core.Const(true)
│ %25 = Base.not_int(%24)::Core.Const(false)
└── goto #6 if not %25
5 ─ Core.Const(:(goto %12))
6 ┄ return %5
You would probably expect the last one to be the slowest, but it turns out to be the fastest, and the suggested approach (Base.Fix2) is the slowest, although with $ interpolation it is only marginally slower than the closure:
julia> @btime map(Base.Fix2(getindex, 2), vector_of_vectors);
216.746 ns (2 allocations: 128 bytes)
julia> @btime map(Base.Fix2(getindex, 2), $vector_of_vectors);
22.735 ns (1 allocation: 96 bytes)
julia> @btime map(v -> v[2], vector_of_vectors);
183.554 ns (2 allocations: 112 bytes)
julia> @btime map(v -> v[2], $vector_of_vectors);
22.526 ns (1 allocation: 96 bytes)
julia> @btime Base.vect(v[2] for v in vector_of_vectors);
90.162 ns (2 allocations: 80 bytes)
julia> @btime Base.vect(v[2] for v in $vector_of_vectors);
18.412 ns (1 allocation: 64 bytes)
Am I doing something wrong? Why is the last one the fastest? My Julia version is as follows:
In [30]: versioninfo()
Julia Version 1.9.0-rc1
Commit 3b2e0d8fbc1 (2023-03-07 07:51 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.4.0)
CPU: 10 × Apple M1 Pro
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
Threads: 1 on 8 virtual cores
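One sanity check that may help narrow this down: the `Body::Vector{Base.Generator{…}}` line in the third `@code_warntype` output suggests that `Base.vect` wraps the generator in a one-element vector instead of iterating it, so that call may not be doing the same work as the two `map` calls. The sketch below compares the results directly. The names `wrapped`, `collected`, `mapped`, and `fixed` are mine, and using `collect` as the generator-based equivalent is an assumption about what the third example was meant to compute.

```julia
vector_of_vectors = [rand(4) for _ in 1:5]

# Base.vect(x) builds a vector whose elements are its arguments, so a single
# generator argument gives a 1-element vector holding the generator itself.
wrapped = Base.vect(v[2] for v in vector_of_vectors)
@show typeof(wrapped) length(wrapped)

# collect actually iterates the generator and stores each v[2].
collected = collect(v[2] for v in vector_of_vectors)

# The two map-based variants from the benchmarks above.
mapped = map(v -> v[2], vector_of_vectors)
fixed = map(Base.Fix2(getindex, 2), vector_of_vectors)

@show collected == mapped == fixed
```

If `wrapped` really has length 1, the timing of the `Base.vect` line would measure only the allocation of that small wrapper vector, which would explain why it looks fastest.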