# How to fit a Dirichlet Distribution using MLE in Distributions.jl?

I would like to fit Dirichlet Distribution to `all_patterns` using MLE in the Distributions package
where alpha to Dirichlet is `fill(1,k)` and `all_patterns` is a combination array

``````all_patterns = (p=(a,c)->c<2 ? a : [[x;y] for x=a for y=p(a,c-1)])(0:1,K)
all_patterns = convert(Array,VectorOfArray(all_patterns)')
``````

and this is what I tried [but not working ]

`fit_mle(Dirichlet,fill(1,K),all_patterns)`

using the modified interface specified here: https://juliastats.org/Distributions.jl/stable/fit/

Do you need to use floating-point numbers? The code below works:

``````using Distributions

K = 3
all_patterns = (p=(a,c)->c<2 ? a : [[x;y] for x=a for y=p(a,c-1)])(1:2,K)
all_patterns = hcat(all_patterns...)

julia> fit(Dirichlet, Float64.(all_patterns))
Dirichlet{Float64}(alpha=[1.677728917182107e-5, 1.677728917182107e-5, 1.677728917182107e-5])
``````
1. My `all_patterns` is a matrix of binary values (0 & 1) whereas in your code itās not the case
1.1 While running you code with binary matrix, I am getting the error
`ArgumentError: Dirichlet: alpha must be a positive vector.`

2. Dirichlet takes a vector (alphas) as an argument. In your code I think `all_patterns` are being passed as the alphas. I am not sure though.

Did you see my answer on Slack? It would be helpful if you could turn your example into an MWE that people can run, rather than trying to guess what `all_patterns` is (as `K` and `VectorOfArrays` are undefined).

I also donāt understand why in your OP you seem to want to pass a vector of alphas, given that thatās the only parameter of the Dirichlet what are you looking to estimate?

• VectorOfArrays come from here:
• Alphas are the priors to the Dirichlet Distribution.
• K should take +ve integer value (User can decide what value to take)

Thanks, that clarifies things a little bit. I still donāt fully understand what youāre trying to do as I havenāt thought through the dimensions of your problem, but it now seems to me that when you say āpriorā you mean an initial value for alpha (rather than a prior in the Bayesian sense). In that case you probably need the `init` keyword:

``````using Distributions, RecursiveArrayTools

K = 3

all_patterns = (p=(a,c)->c<2 ? a : [[x;y] for x=a for y=p(a,c-1)])(0.0:1.0,K)
all_patterns = convert(Array,VectorOfArray(all_patterns)')

fit_mle(Dirichlet, all_patterns, init = [1. for _ ā 1:8])
``````

Note that here I used an init vector of length 8, as your `all_patterns` matrix ends up being 8x3 - thatās what I meant by I havenāt thought through the dimensions of your problem, it seems from your initial post that you might want 3x8 here?

1 Like

You could just write a call to Optim.jl to optimize the logpdf.