How can I fix the following type instability? (Example of a type-unstable function)

I’ve defined the following function:

using DataFrames, Tables

function Silouhette(
  input_data,
  model::KmeansResult
  )::Float64

  if Tables.istable(input_data)
    RunSilouhette(Tables.matrix(input_data), model)
  else
    RunSilouhette(input_data, model)
  end
end

The result of @code_warntype for this function and the following input is:

julia> using ClusterAnalysis

julia> a = rand(10, 2);

julia> model = kmeans(a, 3);

julia> @code_warntype(
         Silouhette(
           DataFrame(a, :auto),
           model
         )
       )
MethodInstance for Silouhette(::DataFrame, ::KmeansResult{Float64})
  from Silouhette(input_data, model::KmeansResult) in Main at e:\Julia Forks\ClusterAnalysis.jl\src\Sil.jl:149   
Arguments
  #self#::Core.Const(Silouhette)
  input_data::DataFrame
  model::KmeansResult{Float64}
Body::Float64
1 ─ %1 = Main.Float64::Core.Const(Float64)
│   %2 = Tables.istable::Core.Const(Tables.istable)
│   %3 = (%2)(input_data)::Core.Const(true)
│        Core.typeassert(%3, Core.Bool)
│   %5 = Tables.matrix::Core.Const(Tables.matrix)
│   %6 = (%5)(input_data)::Matrix
│   %7 = Main.RunSilouhette(%6, model)::Float64
│   %8 = Base.convert(%1, %7)::Float64
│   %9 = Core.typeassert(%8, %1)::Float64
└──      return %9
2 ─      Core.Const(:(Main.RunSilouhette(input_data, model)))
│        Core.Const(:(Base.convert(%1, %11)))
│        Core.Const(:(Core.typeassert(%12, %1)))
└──      Core.Const(:(return %13))

The problem is the line %6 = (%5)(input_data)::Matrix, where Matrix is printed in red. How can I make that type stable?

One way is to use Matrix{Float64}(input_data) instead of Tables.matrix(input_data):

function Silouhette(
  input_data,
  model::KmeansResult
  )::Float64

  if Tables.istable(input_data)
    RunSilouhette(Matrix{Float64}(input_data), model)
  else
    RunSilouhette(input_data, model)
  end
end
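
As a small sketch of what that conversion does (the tables df and df_text below are made up for illustration): the element type is fixed by the call itself rather than by the contents of the table, which is why the compiler has a concrete type to work with. The cost is that every column must be convertible to Float64.

```julia
using DataFrames

# Made-up table with one Int column and one Float64 column
df = DataFrame(a = [1, 2], b = [3.0, 4.0])

# The element type is requested explicitly, so it does not depend on the data
m = Matrix{Float64}(df)
println(typeof(m))

# Caveat: a column that cannot be converted to Float64 makes the call fail
df_text = DataFrame(a = [1, 2], label = ["x", "y"])
failed = try
    Matrix{Float64}(df_text)
    false
catch
    true
end
```

So this variant only fits if the data is known to be purely numeric.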

Hi @fatteneder! Thank you.

The RunSilouhette function calculates the silhouette coefficient of the clustering, and it should get the data as an object of subtype AbstractMatrix. So dispatching won’t help unless I change the content of RunSilouhette. Since that function is fairly long, duplicating it for two kinds of input data wouldn’t be a good idea, because it would take up a lot of space in the script. Please correct me if you think I’m wrong. Thank you!

What do you mean by no-op? I didn’t understand.

Nevermind my previous post.

One way is to use Matrix{Float64}(input_data) instead of Tables.matrix(input_data):

The docs say

help?> Tables.matrix
  Tables.matrix(table; transpose::Bool=false)

  Materialize any table source input as a new Matrix or in the case of a MatrixTable return the originally
  wrapped matrix. If the table column element types are not homogenous, they will be promoted to a common type
  in the materialized Matrix. Note that column names are ignored in the conversion. By default, input table
  columns will be materialized as corresponding matrix columns; passing transpose=true will transpose the
  input with input columns as matrix rows or in the case of a MatrixTable apply permutedims to the originally
  wrapped matrix.

I think the instability comes from the fact that Tables.matrix does automatic type promotion and so its return type is not immediately inferable from its arguments.
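
A quick way to see this, as a sketch with made-up tables: two DataFrames have exactly the same Julia type even when their columns hold different element types, so the compiler only knows that Tables.matrix returns some Matrix, and Matrix without an element type is not a concrete type.

```julia
using DataFrames, Tables

# Made-up tables: same Julia type, different column element types
df_float = DataFrame(a = [1.0, 2.0], b = [3.0, 4.0])
df_int   = DataFrame(a = [1, 2],     b = [3, 4])
df_mixed = DataFrame(a = [1, 2],     b = [3.0, 4.0])

# The column element types are not part of the DataFrame type
same_type = typeof(df_float) === typeof(df_int)

# The element type of the result is decided at run time from the column types
T_float = eltype(Tables.matrix(df_float))
T_int   = eltype(Tables.matrix(df_int))
T_mixed = eltype(Tables.matrix(df_mixed))   # Int and Float64 are promoted

# `Matrix` alone is an abstract (UnionAll) type, hence the red in @code_warntype
println(isconcretetype(Matrix), " ", isconcretetype(Matrix{Float64}))
```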


The RunSilouhette function calculates the silhouette coefficient of the clustering, and it should get the data as an object of subtype AbstractMatrix. So dispatching won’t help unless I change the content of RunSilouhette.

The idea was to contain the type instability within a function barrier, which would allow you to make Silouhette type stable:

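using DataFrames, Tables


# Stand-in for the real computation; it receives a concrete matrix type.
function RunSilouhette(input::AbstractMatrix, model)::Float64
    return sum(input)
end

# Fallback for table inputs: convert once, then dispatch to the method above.
function RunSilouhette(input, model)
    println("yes, we are being called")
    return RunSilouhette(Tables.matrix(input), model)
end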

function Silouhette(
        input_data,
        model
    )
    return RunSilouhette(input_data, model)
end


model = :some_model
df = DataFrame(:a => randn(5), :b => randn(5))

And this gives

julia> @code_warntype Silouhette(df, model)
MethodInstance for Silouhette(::DataFrame, ::Symbol)
  from Silouhette(input_data, model) in Main at /home/.../mwe.jl:14
Arguments
  #self#::Core.Const(Silouhette)
  input_data::DataFrame
  model::Symbol
Body::Float64
1 ─ %1 = Main.RunSilouhette(input_data, model)::Float64
└──      return %1


julia> Silouhette(df, model)
yes, we are being called
-0.8957902425154177

So this does not magically remove the type instability. It just confines it behind a function barrier, so that you can write the Silouhette code in a type-stable manner.
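
To look at the barrier in isolation, here is a minimal sketch that needs only Base. UntypedTable, to_matrix, kernel and outer are hypothetical names invented for this illustration; UntypedTable stands in for any container, such as a DataFrame, whose element types are not part of its type.

```julia
# Hypothetical container: the column element types are hidden from the compiler
struct UntypedTable
    columns::Vector{Any}
end

# The return type cannot be inferred, because the columns are stored as `Any`
to_matrix(t::UntypedTable) = reduce(hcat, t.columns)

# The kernel is compiled separately for each concrete matrix type it receives,
# so the work inside it runs on a known element type
kernel(m::AbstractMatrix) = sum(abs2, m)

# One dynamic dispatch happens at the call to `kernel`; that call is the barrier
outer(t::UntypedTable) = kernel(to_matrix(t))

t = UntypedTable(Any[[1.0, 2.0], [3.0, 4.0]])
result = outer(t)   # 1 + 4 + 9 + 16 = 30.0
```

Unlike the example above, outer has no ::Float64 return annotation, so its own return type is still not inferred; only the work inside kernel benefits.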

What do you mean by no-op? I didn’t understand.

No operation: the @code_warntype output above shows that Silouhette does no ‘real’ work and only calls RunSilouhette. It would be better to add some more code to Silouhette, otherwise one might ask why it needs to be type stable at all :slightly_smiling_face: