Here is your code from the notebook:
using DataFrames
using Tables

n = 10^2
ro_x_co_x_an = DataFrame(
    "In" => rand(1:9, n),
    "Fl" => rand(1.0:9, n),
    "Ch" => rand('a':'z', n),
    "St4" => [join(rand('a':'z', 4)) for _ in 1:n],
    "St8" => [join(rand('a':'z', 8)) for _ in 1:n],
)

# The part of the code using `zip`.
function mazi(ma)
    for (a, b) in zip(ma[:, 4], ma[:, 5])
    end
end

# The part of the code using `eachrow`.
function maea(ma)
    for (a, b) in eachrow(ma[:, [4, 5]])
    end
end

ma = Matrix(ro_x_co_x_an)
By the end of this script, we know that `ma` is a `Matrix{Any}`, because the `Matrix` constructor could only figure this out from the runtime value of `ro_x_co_x_an`. This means that we cannot know the element types at compile time; rather, we have to figure them out at runtime, so subsequent calls have to dynamically dispatch. We do not know which specialized version of `getindex`, `zip`, or `eachrow` to call at compile time, and neither do we know the downstream types. The problem cascades: we have to examine the result at runtime and figure out which methods to call, and this takes time. You can think of dynamic dispatch as a big block of `if`-`elseif` conditions checking the types of the arguments to decide which method to call. This causes branches in the code, meaning the processor cannot start working ahead, especially given security mitigations for speculative execution (e.g. Meltdown, Spectre).
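As a rough illustration (a sketch, not how Julia is actually implemented, and `describe` is a made-up function): iterating a `Vector{Any}` behaves as if the runtime had to run a type-check chain for every single element.

```julia
describe(x) = "other"
describe(x::Int) = "integer"
describe(x::String) = "string"

# Roughly what the runtime has to do for each element of a Vector{Any}:
function describe_dynamic(x)
    if x isa Int
        describe(x::Int)       # resolved to the Int specialization
    elseif x isa String
        describe(x::String)    # resolved to the String specialization
    else
        describe(x)            # generic fallback
    end
end

map(describe_dynamic, Any[1, "a", 2.0])
```

With a concretely typed vector, the whole chain disappears: the compiler picks the one specialization at compile time.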
If we can give the compiler an assertion or a hint of what types to expect, it can figure out which methods to call at compile time. It can then perform optimizations such as inlining the downstream functions, or perhaps not performing certain operations at all.
This statement is incorrect. We want the compiler to do more work at compile time so that less work remains at runtime, which gives us faster execution. Without type information at compile time, the compiler cannot make the accelerating optimizations I mentioned above. Without such optimizations, Julia basically becomes an interpreter with high latency. Sometimes this is preferable to doing specialized compilation for each type variant.
This is the tradeoff that @bkamins discusses here:
The next issue to discuss is `Matrix{Any}`. `Any` can be a real issue for the same reasons as above: we do not know which specific method to call for downstream operations. `Any` is an abstract type, not a concrete one.
julia> isabstracttype(Any)
true
julia> isconcretetype(Any)
false
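Note that the container type `Vector{Any}` is itself concrete (it has a fixed memory layout of boxed pointers); it is the *element* type `Any` that is abstract, so each element's type is only known at runtime. A quick check using only Base:

```julia
# The container type is concrete even though its element type is not.
isconcretetype(Vector{Any})          # true: fixed layout of boxed elements
isconcretetype(eltype(Vector{Any}))  # false: the element type is Any
isconcretetype(String)               # true: a concrete element type
```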
With your code we know that the 4th and 5th columns have a `String` element type. Rather than converting the `DataFrame` to a `Matrix{Any}`, it would be better to select the columns first, preserving any potential type information.
function mzi(da)
    ma = Matrix{Any}(da)
    for (a, b) in zip(ma[:, 4], ma[:, 5])
    end
end

# Side question: Why do you double space your code?
function mzi_str(da)
    for (a, b) in zip(da.St4::Vector{String}, da.St8::Vector{String})
    end
end
julia> @btime mzi($ro_x_co_x_an)
2.267 μs (104 allocations: 7.47 KiB)
julia> @btime mzi_str($ro_x_co_x_an)
74.587 ns (0 allocations: 0 bytes)
If we apply `@code_warntype` to each of these functions, we see:
julia> @code_warntype mzi(ro_x_co_x_an)
MethodInstance for mzi(::DataFrame)
from mzi(da) in Main at REPL[21]:1
Arguments
#self#::Core.Const(mzi)
da::DataFrame
Locals
@_3::Union{Nothing, Tuple{Tuple{Any, Any}, Tuple{Int64, Int64}}}
ma::Matrix{Any}
@_5::Int64
b::Any
a::Any
Body::Nothing
1 ─ %1 = Core.apply_type(Main.Matrix, Main.Any)::Core.Const(Matrix{Any})
│ (ma = (%1)(da))
│ %3 = Base.getindex(ma, Main.:(:), 4)::Vector{Any}
│ %4 = Base.getindex(ma, Main.:(:), 5)::Vector{Any}
│ %5 = Main.zip(%3, %4)::Base.Iterators.Zip{Tuple{Vector{Any}, Vector{Any}}}
│ (@_3 = Base.iterate(%5))
│ %7 = (@_3 === nothing)::Bool
│ %8 = Base.not_int(%7)::Bool
└── goto #4 if not %8
2 ┄ %10 = @_3::Tuple{Tuple{Any, Any}, Tuple{Int64, Int64}}
│ %11 = Core.getfield(%10, 1)::Tuple{Any, Any}
│ %12 = Base.indexed_iterate(%11, 1)::Core.PartialStruct(Tuple{Any, Int64}, Any[Any, Core.Const(2)])
│ (a = Core.getfield(%12, 1))
│ (@_5 = Core.getfield(%12, 2))
│ %15 = Base.indexed_iterate(%11, 2, @_5::Core.Const(2))::Core.PartialStruct(Tuple{Any, Int64}, Any[Any, Core.Const(3)])
│ (b = Core.getfield(%15, 1))
│ %17 = Core.getfield(%10, 2)::Tuple{Int64, Int64}
│ (@_3 = Base.iterate(%5, %17))
│ %19 = (@_3 === nothing)::Bool
│ %20 = Base.not_int(%19)::Bool
└── goto #4 if not %20
3 ─ goto #2
4 ┄ return nothing
julia> @code_warntype mzi_str(ro_x_co_x_an)
MethodInstance for mzi_str(::DataFrame)
from mzi_str(da) in Main at REPL[30]:1
Arguments
#self#::Core.Const(mzi_str)
da::DataFrame
Locals
@_3::Union{Nothing, Tuple{Tuple{String, String}, Tuple{Int64, Int64}}}
@_4::Int64
b::String
a::String
Body::Nothing
1 ─ %1 = Base.getproperty(da, :St4)::AbstractVector
│ %2 = Core.apply_type(Main.Vector, Main.String)::Core.Const(Vector{String})
│ %3 = Core.typeassert(%1, %2)::Vector{String}
│ %4 = Base.getproperty(da, :St8)::AbstractVector
│ %5 = Core.apply_type(Main.Vector, Main.String)::Core.Const(Vector{String})
│ %6 = Core.typeassert(%4, %5)::Vector{String}
│ %7 = Main.zip(%3, %6)::Base.Iterators.Zip{Tuple{Vector{String}, Vector{String}}}
│ (@_3 = Base.iterate(%7))
│ %9 = (@_3 === nothing)::Bool
│ %10 = Base.not_int(%9)::Bool
└── goto #4 if not %10
2 ┄ %12 = @_3::Tuple{Tuple{String, String}, Tuple{Int64, Int64}}
│ %13 = Core.getfield(%12, 1)::Tuple{String, String}
│ %14 = Base.indexed_iterate(%13, 1)::Core.PartialStruct(Tuple{String, Int64}, Any[String, Core.Const(2)])
│ (a = Core.getfield(%14, 1))
│ (@_4 = Core.getfield(%14, 2))
│ %17 = Base.indexed_iterate(%13, 2, @_4::Core.Const(2))::Core.PartialStruct(Tuple{String, Int64}, Any[String, Core.Const(3)])
│ (b = Core.getfield(%17, 1))
│ %19 = Core.getfield(%12, 2)::Tuple{Int64, Int64}
│ (@_3 = Base.iterate(%7, %19))
│ %21 = (@_3 === nothing)::Bool
│ %22 = Base.not_int(%21)::Bool
└── goto #4 if not %22
3 ─ goto #2
4 ┄ return nothing
Above we see that the compiler can figure out that `a` and `b` will indeed be `String`, rather than having to dynamically dispatch on whatever runtime types `a` and `b` happen to have.
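Another common way to recover type information, besides writing the assertions by hand, is a function barrier: pass the columns to an inner function, so Julia compiles a specialized method once the concrete column types are known at the call site. (`total_length` and `process` are made-up names for illustration; this is a sketch of the general idiom, not code from your notebook.)

```julia
# Inner function: Julia compiles a specialized method per concrete column
# type, so inside the loop `a` and `b` have known types and dispatch is
# static.
function total_length(c1::AbstractVector, c2::AbstractVector)
    s = 0
    for (a, b) in zip(c1, c2)
        s += length(a) + length(b)
    end
    return s
end

# Outer function: extracting the columns is still dynamic, but that
# dynamic dispatch happens once per call, not once per element.
process(da) = total_length(da.St4, da.St8)
```

`process` works on the `DataFrame` from above, or on any object exposing `St4`/`St8` properties.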
It can be quite misleading to do a lot of microbenchmarks like this. What you probably want to do is accelerate a complex function. However, when you pull out individual lines as in this example, you risk adding information that the compiler would not otherwise have, or even having the compiler elide the code completely. For the last result, I'm not certain that the `for` loop does anything, because the compiler might be able to figure out that we are actually doing nothing in the loop and just ignore it entirely. In this case, that does not appear to happen, but as the compiler gets smarter over time, it could.
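One way to hedge against the compiler deleting an empty loop is to make the loop compute a value that the benchmark consumes; a sketch along the lines of `mzi_str` (the `_sum` variant is my made-up name):

```julia
# Returning a value computed in the loop keeps the work observable, so
# the compiler cannot remove the loop body as dead code.
function mzi_str_sum(da)
    s = 0
    for (a, b) in zip(da.St4::Vector{String}, da.St8::Vector{String})
        s += length(a) + length(b)
    end
    return s
end
```

Benchmarking this with `@btime mzi_str_sum($ro_x_co_x_an)` then measures the loop itself rather than possibly measuring nothing.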