Matrix vs Matrix{Any}

Can someone please explain why separating the code that loops over the Matrix into its own function improves performance? With @code_warntype, I noticed the difference between Matrix and Matrix{Any}. Thank you.

The source is here https://github.com/KwatMDPhD/Play.pro/blob/main/code/learn/21.data_frame_each_function_barrier.ipynb

julia> isconcretetype(Matrix)
false

julia> isconcretetype(Matrix{Any})
true

Thanks! Then how come the same piece of code, Matrix(ro_x_co_x_an), produces these two different types?

Is this because, in mt = Matrix(dataframe); f(mt); ..., the compiler (at compile time) does not know the type of mt, since dataframe has fields that are parameterized?

Here is my summary.

The compiler optimizes code at function boundaries.
Use multiple smaller functions!

But what is really happening here is that a DataFrame's columns can be of any type.
DataFrame itself is not parameterized, but its fields (the columns) are.
So the columns can change at run time, and the compiler cannot know their types, or the types of objects (like a Matrix) derived from them.

When using the function barrier, the compiler knows that it gets a Matrix derived from the DataFrame but not its element type, and dispatches to Matrix{Any}.
The compiler does so because Matrix is not a concrete type but Matrix{Any} is.

In the slower code, if we help the compiler know that the Matrix is a Matrix{Any}, the compiler does less work and the performance is even better than with the function barrier.
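
A minimal sketch of what such a function barrier might look like (hypothetical names, not the notebook code): the outer function only knows it got a DataFrame, and the inner kernel gets compiled for the concrete column types discovered at run time.

# The outer function pulls the columns out of the DataFrame; their concrete
# types are only discovered at run time.
function process(da)
    col_a = da[!, 4]
    col_b = da[!, 5]
    # This call is one dynamic dispatch; inside `kernel` the code is
    # compiled for the concrete types of `col_a` and `col_b`.
    return kernel(col_a, col_b)
end

# Type-stable kernel: the compiler specializes this for the column types.
function kernel(a, b)
    n = 0
    for (x, y) in zip(a, b)
        n += length(x) + length(y)
    end
    return n
end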


Here is your code from the notebook:

using DataFrames
using Tables

n = 10^2

ro_x_co_x_an = DataFrame(
    "In"=>rand(1:9, n),
    "Fl"=>rand(1.0:9, n),
    "Ch"=>rand('a':'z', n),
    "St4"=>[join(rand('a':'z', 4)) for _ in 1:n],
    "St8"=>[join(rand('a':'z', 8)) for _ in 1:n],
)

# The part of the code that uses `zip`.
function mazi(ma)
    
    for (a, b) in zip(ma[:, 4], ma[:, 5])
        
    end
    
end

# The part of the code that uses `eachrow`.
function maea(ma)
    
    for (a, b) in eachrow(ma[:, [4, 5]])
        
    end
    
end

ma = Matrix(ro_x_co_x_an)

By the end of this script, we know that ma is a Matrix{Any}, but only because the Matrix constructor figured this out from the value of ro_x_co_x_an at run time. From the types alone, the compiler cannot know this at compile time.
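
For illustration (hypothetical snippet, not from the original post): the element type of the result is computed from the column element types of the particular DataFrame, so the same constructor can yield different concrete types for different inputs.

# The mixed Int/Float64/Char/String columns promote to Any.
typeof(Matrix(ro_x_co_x_an))                                  # Matrix{Any}

# With only numeric columns, the columns promote to Float64 instead.
typeof(Matrix(DataFrame("a" => [1, 2], "b" => [3.0, 4.0])))   # expected: Matrix{Float64}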

Since the element type is only determined at runtime, subsequent calls have to be dynamically dispatched. We do not know which specialized version of getindex, zip, or eachrow to call at compile time, nor do we know the downstream types, so the problem cascades. Instead, we have to examine the result at runtime and figure out which methods to call, and that takes time. You can think of dynamic dispatch as a big block of if/elseif conditions checking the runtime type to decide which method to call. This introduces branches in the code, meaning the processor cannot work as far ahead, especially given security concerns around speculative execution (e.g. Meltdown, Spectre).
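
A small, hypothetical illustration of this cost (not from the notebook): the same loop over a Vector{Any} and over a Vector{Int} differs only in the element type, yet the Any version has to box each element and resolve + by dynamic dispatch on every iteration.

using BenchmarkTools

xs_any = Any[rand(1:9) for _ in 1:10^4]   # element type Any
xs_int = Int[rand(1:9) for _ in 1:10^4]   # concrete element type Int

function total(v)
    s = 0
    for x in v
        s += x   # with eltype Any, `+` is resolved at runtime each iteration
    end
    return s
end

@btime total($xs_any)   # expect allocations and a much slower time
@btime total($xs_int)   # fully specialized loop, no dispatch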

If we can give the compiler an assertion or a hint about what types to expect, it can figure out which methods to call at compile time. It can then perform optimizations such as inlining the downstream functions, or perhaps eliding certain operations entirely.

This statement is incorrect. We want the compiler to do more work at compile time so that less work has to be done at runtime, which gives faster execution. Without type information at compile time, the compiler cannot make the accelerating optimizations mentioned above. Without such optimizations, Julia basically becomes an interpreter with high latency. Sometimes this is preferable to doing specialized compilation for each type variant.

This is the tradeoff that @bkamins discusses here:

The next issue to discuss is Matrix{Any}. Any can be a real issue for the same reasons as above: we do not know which specific method to call for downstream operations. Any is an abstract type, not a concrete one.

julia> isabstracttype(Any)
true

julia> isconcretetype(Any)
false

With your code, we know that the 4th and 5th columns have a String element type. Rather than converting the DataFrame to a Matrix{Any}, it would be better to select the columns first, preserving any available type information.

function mzi(da)
    
    ma = Matrix{Any}(da)
    
    for (a, b) in zip(ma[:, 4], ma[:, 5])
        
    end
    
end
# Side question: Why do you double space your code?

function mzi_str(da)
    for (a, b) in zip(da.St4::Vector{String}, da.St8::Vector{String})
    end
end

julia> using BenchmarkTools

julia> @btime mzi($ro_x_co_x_an)
  2.267 μs (104 allocations: 7.47 KiB)

julia> @btime mzi_str($ro_x_co_x_an)
  74.587 ns (0 allocations: 0 bytes)

If we apply @code_warntype to each of these functions, we see

julia> @code_warntype mzi(ro_x_co_x_an)
MethodInstance for mzi(::DataFrame)
  from mzi(da) in Main at REPL[21]:1
Arguments
  #self#::Core.Const(mzi)
  da::DataFrame
Locals
  @_3::Union{Nothing, Tuple{Tuple{Any, Any}, Tuple{Int64, Int64}}}
  ma::Matrix{Any}
  @_5::Int64
  b::Any
  a::Any
Body::Nothing
1 ─ %1  = Core.apply_type(Main.Matrix, Main.Any)::Core.Const(Matrix{Any})
│         (ma = (%1)(da))
│   %3  = Base.getindex(ma, Main.:(:), 4)::Vector{Any}
│   %4  = Base.getindex(ma, Main.:(:), 5)::Vector{Any}
│   %5  = Main.zip(%3, %4)::Base.Iterators.Zip{Tuple{Vector{Any}, Vector{Any}}}
│         (@_3 = Base.iterate(%5))
│   %7  = (@_3 === nothing)::Bool
│   %8  = Base.not_int(%7)::Bool
└──       goto #4 if not %8
2 ┄ %10 = @_3::Tuple{Tuple{Any, Any}, Tuple{Int64, Int64}}
│   %11 = Core.getfield(%10, 1)::Tuple{Any, Any}
│   %12 = Base.indexed_iterate(%11, 1)::Core.PartialStruct(Tuple{Any, Int64}, Any[Any, Core.Const(2)])
│         (a = Core.getfield(%12, 1))
│         (@_5 = Core.getfield(%12, 2))
│   %15 = Base.indexed_iterate(%11, 2, @_5::Core.Const(2))::Core.PartialStruct(Tuple{Any, Int64}, Any[Any, Core.Const(3)])
│         (b = Core.getfield(%15, 1))
│   %17 = Core.getfield(%10, 2)::Tuple{Int64, Int64}
│         (@_3 = Base.iterate(%5, %17))
│   %19 = (@_3 === nothing)::Bool
│   %20 = Base.not_int(%19)::Bool
└──       goto #4 if not %20
3 ─       goto #2
4 ┄       return nothing


julia> @code_warntype mzi_str(ro_x_co_x_an)
MethodInstance for mzi_str(::DataFrame)
  from mzi_str(da) in Main at REPL[30]:1
Arguments
  #self#::Core.Const(mzi_str)
  da::DataFrame
Locals
  @_3::Union{Nothing, Tuple{Tuple{String, String}, Tuple{Int64, Int64}}}
  @_4::Int64
  b::String
  a::String
Body::Nothing
1 ─ %1  = Base.getproperty(da, :St4)::AbstractVector
│   %2  = Core.apply_type(Main.Vector, Main.String)::Core.Const(Vector{String})
│   %3  = Core.typeassert(%1, %2)::Vector{String}
│   %4  = Base.getproperty(da, :St8)::AbstractVector
│   %5  = Core.apply_type(Main.Vector, Main.String)::Core.Const(Vector{String})
│   %6  = Core.typeassert(%4, %5)::Vector{String}
│   %7  = Main.zip(%3, %6)::Base.Iterators.Zip{Tuple{Vector{String}, Vector{String}}}
│         (@_3 = Base.iterate(%7))
│   %9  = (@_3 === nothing)::Bool
│   %10 = Base.not_int(%9)::Bool
└──       goto #4 if not %10
2 ┄ %12 = @_3::Tuple{Tuple{String, String}, Tuple{Int64, Int64}}
│   %13 = Core.getfield(%12, 1)::Tuple{String, String}
│   %14 = Base.indexed_iterate(%13, 1)::Core.PartialStruct(Tuple{String, Int64}, Any[String, Core.Const(2)])
│         (a = Core.getfield(%14, 1))
│         (@_4 = Core.getfield(%14, 2))
│   %17 = Base.indexed_iterate(%13, 2, @_4::Core.Const(2))::Core.PartialStruct(Tuple{String, Int64}, Any[String, Core.Const(3)])
│         (b = Core.getfield(%17, 1))
│   %19 = Core.getfield(%12, 2)::Tuple{Int64, Int64}
│         (@_3 = Base.iterate(%7, %19))
│   %21 = (@_3 === nothing)::Bool
│   %22 = Base.not_int(%21)::Bool
└──       goto #4 if not %22
3 ─       goto #2
4 ┄       return nothing

Above we see that the compiler can figure out that a and b will indeed be String, rather than having to dynamically dispatch on them at runtime.

It can be quite misleading to do a lot of microbenchmarks like this. What you probably want to do is accelerate a larger, more complex function. When we pull out individual lines as in this example, you risk adding information that the compiler would not otherwise have, or even having the compiler elide code completely. For the last result, I’m not certain that the for loop does anything, because the compiler might be able to figure out that we are actually doing nothing inside it and just remove the loop entirely. In this case, that does not appear to happen, but as the compiler gets smarter over time, it could.
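
One way to guard against that, sketched here as a hypothetical variant of mzi_str (not from the notebook), is to make the loop compute a value that the function returns, so the compiler cannot discard the work:

function mzi_str_sum(da)
    s = 0
    for (a, b) in zip(da.St4::Vector{String}, da.St8::Vector{String})
        s += length(a) + length(b)   # observable work that cannot be elided
    end
    return s
end

# @btime mzi_str_sum($ro_x_co_x_an)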


Thank you so much, @mkitti. I learned so much from this amazing write-up.