@nilshg’s suggestion will work but is not recommended, as it relies on something that is not part of the public API.
Accessing the probabilities is described in the Working with Categorical Data section of the manual (see also this section of “Getting Started”). Here are some more examples:
julia> y = coerce(["c", "b", "a"], Multiclass)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"c"
"b"
"a"
julia> d = UnivariateFinite(["a", "c"], [0.1, 0.9], pool=y)
UnivariateFinite{Multiclass{3}}(a=>0.1, c=>0.9)
julia> pdf(d, "a")
0.1
julia> pdf(d, levels(y))
3-element Vector{Float64}:
0.1
0.0
0.9
And for a vector of distributions:
julia> d_vector = UnivariateFinite(["a", "b"], [0.1 0.9; 0.4 0.6], pool=missing)
2-element MLJBase.UnivariateFiniteVector{Multiclass{2}, String, UInt8, Float64}:
UnivariateFinite{Multiclass{2}}(a=>0.1, b=>0.9)
UnivariateFinite{Multiclass{2}}(a=>0.4, b=>0.6)
julia> broadcast(pdf, d_vector, "a")
2-element Vector{Float64}:
0.1
0.4
julia> pdf(d_vector, ["a", "b"])
2×2 Matrix{Float64}:
0.1 0.9
0.4 0.6
julia> pdf(d_vector, ["b", "a"])
2×2 Matrix{Float64}:
0.9 0.1
0.6 0.4
In a basic MLJ workflow you shouldn’t really need the probabilities in matrix form. For example, all probabilistic measures in MLJ (e.g., LogLoss()) expect distributions as the first argument, not numerical probabilities or parameters:
julia> y = coerce(rand(["a", "b"], 10), OrderedFactor)
10-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"a"
"b"
"b"
"b"
"b"
"a"
"b"
"a"
"a"
"a"
julia> yhat = UnivariateFinite(["a", "b"], rand(10), augment=true, pool=y)
10-element MLJBase.UnivariateFiniteVector{OrderedFactor{2}, String, UInt32, Float64}:
UnivariateFinite{OrderedFactor{2}}(a=>0.863, b=>0.137)
UnivariateFinite{OrderedFactor{2}}(a=>0.995, b=>0.00547)
UnivariateFinite{OrderedFactor{2}}(a=>0.0523, b=>0.948)
UnivariateFinite{OrderedFactor{2}}(a=>0.859, b=>0.141)
UnivariateFinite{OrderedFactor{2}}(a=>0.216, b=>0.784)
UnivariateFinite{OrderedFactor{2}}(a=>0.277, b=>0.723)
UnivariateFinite{OrderedFactor{2}}(a=>0.985, b=>0.0148)
UnivariateFinite{OrderedFactor{2}}(a=>0.206, b=>0.794)
UnivariateFinite{OrderedFactor{2}}(a=>0.373, b=>0.627)
UnivariateFinite{OrderedFactor{2}}(a=>0.553, b=>0.447)
julia> LogLoss()(yhat, y)
10-element Vector{Float64}:
0.14702211036373602
5.20855707692713
0.053705325134450276
1.9585781504187798
0.24387484194022874
1.2835655183125487
4.210745321783671
1.5783092501620064
0.9856382706229556
0.5921152165139617
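To connect these losses back to the distributions: each value looks like the negative log of the probability that the prediction assigns to the observed class. Here is a minimal sketch in plain Julia (no MLJ needed) that checks this for the first three rows. The probabilities are copied by hand from the rounded display above, and the names p_observed, losses and reported are just for illustration:

```julia
# Probability each prediction assigns to the class actually observed,
# copied by hand from the first three rows of the printed output above
# (the first three entries of y are "a", "b", "b").
p_observed = [0.863, 0.00547, 0.948]

# Assumed per-observation log loss: negative log of that probability.
losses = -log.(p_observed)

# First three values reported by LogLoss()(yhat, y) above.
reported = [0.14702211036373602, 5.20855707692713, 0.053705325134450276]

# Largest discrepancy; it should be small, since the displayed
# probabilities are rounded to three significant figures.
println(maximum(abs.(losses .- reported)))
```

With MLJ loaded, broadcasting pdf over both vectors, as in -log.(pdf.(yhat, y)), should presumably give the same numbers without the rounding, but I have not run that here.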
Hope this helps!