How to fit a Dirichlet Distribution using MLE in Distributions.jl?

stochastic_guy · November 5, 2020, 9:30pm

I would like to fit Dirichlet Distribution to all_patterns using MLE in the Distributions package
where alpha to Dirichlet is fill(1,k) and all_patterns is a combination array

all_patterns = (p=(a,c)->c<2 ? a : [[x;y] for x=a for y=p(a,c-1)])(0:1,K)
all_patterns = convert(Array,VectorOfArray(all_patterns)')

and this is what I tried [but not working ]

fit_mle(Dirichlet,fill(1,K),all_patterns)

using the modified interface specified here: Distribution Fitting · Distributions.jl

mthelm85 · November 5, 2020, 11:40pm

Do you need to use floating-point numbers? The code below works:

using Distributions

K = 3
all_patterns = (p=(a,c)->c<2 ? a : [[x;y] for x=a for y=p(a,c-1)])(1:2,K)
all_patterns = hcat(all_patterns...)

julia> fit(Dirichlet, Float64.(all_patterns))
Dirichlet{Float64}(alpha=[1.677728917182107e-5, 1.677728917182107e-5, 1.677728917182107e-5])

stochastic_guy · November 6, 2020, 7:05am

My all_patterns is a matrix of binary values (0 & 1) whereas in your code it’s not the case
1.1 While running you code with binary matrix, I am getting the error
ArgumentError: Dirichlet: alpha must be a positive vector.
Dirichlet takes a vector (alphas) as an argument. In your code I think all_patterns are being passed as the alphas. I am not sure though.

nilshg · November 6, 2020, 2:24pm

Did you see my answer on Slack? It would be helpful if you could turn your example into an MWE that people can run, rather than trying to guess what all_patterns is (as K and VectorOfArrays are undefined).

I also don’t understand why in your OP you seem to want to pass a vector of alphas, given that that’s the only parameter of the Dirichlet what are you looking to estimate?

stochastic_guy · November 6, 2020, 4:40pm

Sorry I missed your response in slack. I just saw your reply.

VectorOfArrays come from here:

Alphas are the priors to the Dirichlet Distribution.
K should take +ve integer value (User can decide what value to take)

nilshg · November 6, 2020, 7:14pm

Thanks, that clarifies things a little bit. I still don’t fully understand what you’re trying to do as I haven’t thought through the dimensions of your problem, but it now seems to me that when you say “prior” you mean an initial value for alpha (rather than a prior in the Bayesian sense). In that case you probably need the init keyword:

using Distributions, RecursiveArrayTools

K = 3

all_patterns = (p=(a,c)->c<2 ? a : [[x;y] for x=a for y=p(a,c-1)])(0.0:1.0,K)
all_patterns = convert(Array,VectorOfArray(all_patterns)')

fit_mle(Dirichlet, all_patterns, init = [1. for _ ∈ 1:8])

Note that here I used an init vector of length 8, as your all_patterns matrix ends up being 8x3 - that’s what I meant by I haven’t thought through the dimensions of your problem, it seems from your initial post that you might want 3x8 here?

ChrisRackauckas · November 7, 2020, 9:53pm

You could just write a call to Optim.jl to optimize the logpdf.

Topic		Replies	Views
Quantiles Dirichlet distribution General Usage statistics , distributions	1	669	May 10, 2021
Plot Dirichlet distributions Visualization plotting , makie	1	381	October 20, 2023
Multivariate dirichlet mixture with Turing Probabilistic Programming turing	6	1224	May 16, 2020
Problem with fitting Beta distribution General Usage question , package , distributions	1	740	July 15, 2021
Turing Mixture Models with Dirichlet weightings Probabilistic Programming turing	8	420	April 25, 2024

How to fit a Dirichlet Distribution using MLE in Distributions.jl?

Related topics