CategoricalArray levels! allowmissing

I am trying to order the levels by frequency, but the code below throws an error. What is wrong?

using CategoricalArrays, FreqTables
x = ["a", "b", "c", "c", missing]
cx = CategoricalArray(x)
lvls = names(sort(freqtable(collect(skipmissing(cx))), rev=true))
levels!(cx, lvls, allowmissing=true)

ERROR: LoadError: MethodError: Cannot `convert` an object of type CategoricalVector{String, UInt32, String, CategoricalValue{String, UInt32}, Union{}} to an object of type String

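This is tricky indeed. names returns one vector of names per dimension, so you have to take the first one, and its elements are CategoricalValues that must be unwrapped to plain strings. You need to do: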

lvls = unwrap.(names(sort(freqtable(collect(skipmissing(cx))), rev=true))[1])
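Put together with the original example, the workaround might look like the sketch below. The intermediate variable tbl is only introduced here for readability; everything else is taken from the snippets above.

```julia
using CategoricalArrays, FreqTables

x = ["a", "b", "c", "c", missing]
cx = CategoricalArray(x)

# Frequency table of the non-missing values, most frequent first
tbl = sort(freqtable(collect(skipmissing(cx))), rev=true)

# names(tbl) holds one vector of names per dimension; take the first one
# and unwrap its CategoricalValue elements into plain Strings
lvls = unwrap.(names(tbl)[1])

# Reorder the levels; "c" occurs twice, so it becomes the first level
levels!(cx, lvls, allowmissing=true)
```

The missing entry in cx is left as it is, since only the order of the levels changes.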

@nalimilan - maybe we could allow the levels passed to levels! to be CategoricalValues as well?


Yeah, that’s something that I wanted to do for a long time. Let’s fix it at last: Pull Request #365 in JuliaData/CategoricalArrays.jl, "Support any `AbstractVector`s in `levels!` and `CategoricalPool`".

Though note that in your example you’ll still need levels!(cx, lvls[1], allowmissing=true), since names returns one vector of names per dimension.

We should probably provide more convenience methods to order levels by frequency and so on, like forcats in R.
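As a rough idea of what such a convenience method could look like, here is a minimal sketch. The name levels_by_freq! is hypothetical and not part of CategoricalArrays.jl; it only uses levels, levels!, unwrap and Base functions.

```julia
using CategoricalArrays

# Hypothetical helper (not part of CategoricalArrays.jl):
# reorder the levels of cx in place, most frequent first.
function levels_by_freq!(cx::CategoricalArray)
    lv = unwrap.(levels(cx))            # current levels as plain values
    counts = Dict(l => 0 for l in lv)   # unused levels keep a count of 0
    for v in skipmissing(cx)
        counts[unwrap(v)] += 1
    end
    # Sort by decreasing count; all existing levels are kept
    ordered = sort(lv; by = l -> counts[l], rev = true)
    return levels!(cx, ordered)
end

cx = CategoricalArray(["a", "b", "c", "c", missing])
levels_by_freq!(cx)
```

Because every existing level is kept, no value can become missing, so allowmissing=true is not needed in this sketch.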


Now that this is merged, is the solution the same as before?