I have also posted this question on GitHub as Issue #34 in SciML/SciMLStyle: "Are the examples in 'Closures should be avoided whenever possible' still valid in Julia v1.9+?"
I just tested the examples in the section “Closures should be avoided whenever possible”, and it turns out that the suggested approach (using Base.Fix2) is no longer faster than using closures. For example:
julia> vector_of_vectors = [rand(4) for _ in 1:5];
julia> @code_warntype map(Base.Fix2(getindex, 2), vector_of_vectors)
MethodInstance for map(::Base.Fix2{typeof(getindex), Int64}, ::Vector{Vector{Float64}})
from map(f, A::AbstractArray) @ Base abstractarray.jl:3255
Arguments
#self#::Core.Const(map)
f::Base.Fix2{typeof(getindex), Int64}
A::Vector{Vector{Float64}}
Body::Vector{Float64}
1 ─ %1 = Base.Generator(f, A)::Base.Generator{Vector{Vector{Float64}}, Base.Fix2{typeof(getindex), Int64}}
│ %2 = Base.collect_similar(A, %1)::Vector{Float64}
└── return %2
julia> @code_warntype map(v -> v[2], vector_of_vectors)
MethodInstance for map(::var"#25#26", ::Vector{Vector{Float64}})
from map(f, A::AbstractArray) @ Base abstractarray.jl:3255
Arguments
#self#::Core.Const(map)
f::Core.Const(var"#25#26"())
A::Vector{Vector{Float64}}
Body::Vector{Float64}
1 ─ %1 = Base.Generator(f, A)::Base.Generator{Vector{Vector{Float64}}, var"#25#26"}
│ %2 = Base.collect_similar(A, %1)::Vector{Float64}
└── return %2
julia> @code_warntype Base.vect(v[2] for v in vector_of_vectors)
MethodInstance for Base.vect(::Base.Generator{Vector{Vector{Float64}}, var"#27#28"})
from vect(X::T...) where T @ Base array.jl:126
Static Parameters
T = Base.Generator{Vector{Vector{Float64}}, var"#27#28"}
Arguments
#self#::Core.Const(Base.vect)
X::Tuple{Base.Generator{Vector{Vector{Float64}}, var"#27#28"}}
Locals
@_3::Union{Nothing, Tuple{Int64, Int64}}
@_4::Int64
i::Int64
Body::Vector{Base.Generator{Vector{Vector{Float64}}, var"#27#28"}}
1 ─ %1 = Base.length(X)::Core.Const(1)
│ %2 = (1:%1)::Core.Const(1:1)
│ %3 = Base.IteratorSize(%2)::Core.Const(Base.HasShape{1}())
│ %4 = (%3 isa Base.SizeUnknown)::Core.Const(false)
│ %5 = Base._array_for($(Expr(:static_parameter, 1)), %2, %3)::Vector{Base.Generator{Vector{Vector{Float64}}, var"#27#28"}}
│ %6 = Base.LinearIndices(%5)::LinearIndices{1, Tuple{Base.OneTo{Int64}}}
│ (@_4 = Base.first(%6))
│ (@_3 = Base.iterate(%2))
│ %9 = (@_3::Core.Const((1, 1)) === nothing)::Core.Const(false)
│ %10 = Base.not_int(%9)::Core.Const(true)
└── goto #6 if not %10
2 ─ %12 = @_3::Core.Const((1, 1))
│ (i = Core.getfield(%12, 1))
│ %14 = Core.getfield(%12, 2)::Core.Const(1)
│ %15 = Base.getindex(X, i::Core.Const(1))::Base.Generator{Vector{Vector{Float64}}, var"#27#28"}
│ nothing
└── goto #4 if not %4
3 ─ Core.Const(:(Base.push!(%5, %15)))
└── Core.Const(:(goto %21))
4 ┄ Base.setindex!(%5, %15, @_4::Core.Const(1))
│ nothing
│ (@_4 = Base.add_int(@_4::Core.Const(1), 1))
│ (@_3 = Base.iterate(%2, %14))
│ %24 = (@_3::Core.Const(nothing) === nothing)::Core.Const(true)
│ %25 = Base.not_int(%24)::Core.Const(false)
└── goto #6 if not %25
5 ─ Core.Const(:(goto %12))
6 ┄ return %5
You might expect the last one to be the slowest, but it turns out to be the fastest, and the suggested approach turns out to be the slowest:
julia> @btime map(Base.Fix2(getindex, 2), vector_of_vectors);
216.746 ns (2 allocations: 128 bytes)
julia> @btime map(Base.Fix2(getindex, 2), $vector_of_vectors);
22.735 ns (1 allocation: 96 bytes)
julia> @btime map(v -> v[2], vector_of_vectors);
183.554 ns (2 allocations: 112 bytes)
julia> @btime map(v -> v[2], $vector_of_vectors);
22.526 ns (1 allocation: 96 bytes)
julia> @btime Base.vect(v[2] for v in vector_of_vectors);
90.162 ns (2 allocations: 80 bytes)
julia> @btime Base.vect(v[2] for v in $vector_of_vectors);
18.412 ns (1 allocation: 64 bytes)
Am I doing something wrong? Why is the last one the fastest? My Julia version is as follows:
In [30]: versioninfo()
Julia Version 1.9.0-rc1
Commit 3b2e0d8fbc1 (2023-03-07 07:51 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.4.0)
CPU: 10 × Apple M1 Pro
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
Threads: 1 on 8 virtual cores
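One check that may be worth running before comparing the timings is whether the three expressions actually return the same value. The `@code_warntype` output above reports `Body::Vector{Float64}` for the two `map` calls but `Body::Vector{Base.Generator{...}}` for `Base.vect`, which suggests that `Base.vect` wraps the generator in a one-element vector instead of iterating it. A minimal sketch of that check follows; the names `a`, `b`, `c`, and `d` are only illustrative, and `collect` is included as an assumed closer equivalent of the `map` calls, not something taken from the style guide.

```julia
# Same input shape as in the post.
vector_of_vectors = [rand(4) for _ in 1:5]

# The two map-based variants from the post.
a = map(Base.Fix2(getindex, 2), vector_of_vectors)
b = map(v -> v[2], vector_of_vectors)

# The generator variant from the post: Base.vect receives ONE argument
# (the generator object) and stores it as the single element of a vector.
c = Base.vect(v[2] for v in vector_of_vectors)

# Assumed alternative: collect iterates the generator and gathers the elements.
d = collect(v[2] for v in vector_of_vectors)

@show typeof(a) typeof(b) typeof(c) typeof(d)
@show a == b                    # both map variants give the same numbers
@show length(c)                 # c holds the generator itself, not 5 floats
@show c[1] isa Base.Generator
@show d == a                    # collect matches the map results
```

If `c` does turn out to hold the unevaluated generator, the `Base.vect` timing measures building a one-element vector, not extracting the second element of each inner vector, so it would not be directly comparable with the two `map` timings.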