Package for Non-Parametric Multivariate Discrete Distributions

I’m looking for a package that implements tools for working with non-parametric multivariate discrete distributions. Ideally, this would look something like a multivariate version of DiscreteNonParametric in Distributions.jl. I’m sure this wouldn’t be a big lift to implement, but I don’t want to reinvent the wheel if it’s already been done.

If it is just for sampling, Distributions.jl contains utilities to build product distributions. However, I’m not sure they support parameter fitting.

Unfortunately, not every joint distribution can be represented as a product of its marginals.

E.g.

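using LinearAlgebra
a = normalize(rand(3,4), 1)  # random 3×4 joint pmf (entries sum to 1)
a1 = sum(a, dims=1)          # marginal of the second variable (1×4)
a2 = sum(a, dims=2)          # marginal of the first variable (3×1)
anew = a2 * a1               # outer product of the marginals (3×4)
anew ≈ a  # false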

In this case anew is a valid joint distribution equal to the product of the marginals of a, but it is not equal to a.
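
To make that concrete, here is a small sketch that checks both halves of the claim: anew sums to one and has the same marginals as a, yet the two tables differ. The total variation distance (the name tv is just for illustration) is one way to put a number on the gap.

```julia
using LinearAlgebra

a = normalize(rand(3, 4), 1)   # random joint pmf
a1 = sum(a, dims=1)            # marginal of the second variable
a2 = sum(a, dims=2)            # marginal of the first variable
anew = a2 * a1                 # product of the marginals

# anew is a valid joint pmf with the same marginals as a
sum(anew) ≈ 1
sum(anew, dims=1) ≈ a1
sum(anew, dims=2) ≈ a2

# total variation distance between a and the product of its marginals;
# it is zero only when the two variables are independent under a
tv = sum(abs.(a .- anew)) / 2
```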

What kind of representation do you need? If you don’t mind an “extended” representation (as in, every tuple has its own probability mass and we disregard the connections between said masses), you can always encode all your tuples as integers, even though it’s a hassle.
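
A rough sketch of that integer encoding, using only the standard library: flatten the joint table, draw a linear index by inverse-CDF sampling, and decode it back to a tuple with CartesianIndices. The helper name sample_joint is made up for illustration; the flattened vector vec(a) could presumably also be handed to a univariate categorical type instead of sampling by hand.

```julia
using LinearAlgebra, Random

a = normalize(rand(3, 4), 1)   # joint pmf over a 3×4 grid of outcomes

# Hypothetical helper: draw one index tuple from the joint table p.
function sample_joint(rng::AbstractRNG, p::AbstractArray)
    cdf = cumsum(vec(p))                   # CDF over the linear indices
    u = (1 - rand(rng)) * cdf[end]         # uniform draw in (0, total mass]
    k = searchsortedfirst(cdf, u)          # first index whose CDF reaches u
    return Tuple(CartesianIndices(p)[k])   # decode the integer back to a tuple
end

rng = MersenneTwister(1)
draws = [sample_joint(rng, a) for _ in 1:10_000]
```

The integer k is the "encoded" outcome; the only hassle is the bookkeeping between k and the original tuple, which CartesianIndices handles here.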
On the other hand, if you want a “compact” representation, where the dependencies between probability masses are accounted for (typically this would make sense for your example above), you may want to look for libraries to handle probabilistic graphical models.
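
As a toy illustration of what "compact" can mean, assume three variables that form a chain, so the joint factorizes as p(x1) p(x2 | x1) p(x3 | x2). This is a hand-rolled sketch, not the API of any graphical-model library, and all names in it are invented for the example.

```julia
k = 4                                        # number of states per variable

col_normalize(M) = M ./ sum(M, dims=1)       # make every column sum to 1

p1  = vec(col_normalize(rand(k, 1)))         # p(x1)
T12 = col_normalize(rand(k, k))              # T12[j, i] = p(x2 = j | x1 = i)
T23 = col_normalize(rand(k, k))              # T23[l, j] = p(x3 = l | x2 = j)

# The full table can be rebuilt on demand from the factors
joint = [p1[i] * T12[j, i] * T23[l, j] for i in 1:k, j in 1:k, l in 1:k]
sum(joint) ≈ 1

# Stored numbers: factors versus the full table
nparams = length(p1) + length(T12) + length(T23)   # k + 2k^2
nfull = length(joint)                              # k^3
```

The saving grows quickly with the number of variables, which is where dedicated graphical-model tooling starts to pay off.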