Distributions.jl: is it possible to define a Normal+Dirac mixture distribution?

Vincent_Picaud · April 13, 2022, 11:20am

I want to manipulate a distribution mixture of the form

\pi\delta_{x_0}+(1-\pi)\mathcal{N}(\mu,\sigma)

I have checked that I can create a mixture of Normal and Cauchy, for example:

julia> using Distributions

julia> MixtureModel([Normal(),Cauchy()])
MixtureModel{Distribution{Univariate, Continuous}}(K = 2)
components[1] (prior = 0.5000): Normal{Float64}(μ=0.0, σ=1.0)
components[2] (prior = 0.5000): Cauchy{Float64}(μ=0.0, σ=1.0)

However, when I try to define a Normal + Dirac mixture, I get an error:

julia> MixtureModel([Normal(),Dirac(0.0)])
ERROR: MethodError: no method matching value_support(::Type{UnivariateDistribution})
Closest candidates are:
  value_support(::Type{<:Distribution{VF, VS}}) where {VF, VS} at ~/.julia/packages/Distributions/O4ZJg/src/common.jl:147
  value_support(::Type{<:Sampleable{<:VariateForm, VS}}) where VS at ~/.julia/packages/Distributions/O4ZJg/src/common.jl:47
Stacktrace:
 [1] MixtureModel(components::Vector{UnivariateDistribution}, prior::Categorical{Float64, Vector{Float64}})
   @ Distributions ~/.julia/packages/Distributions/O4ZJg/src/mixtures/mixturemodel.jl:134
 [2] MixtureModel(components::Vector{UnivariateDistribution})
   @ Distributions ~/.julia/packages/Distributions/O4ZJg/src/mixtures/mixturemodel.jl:117
 [3] top-level scope
   @ REPL[16]:1

My question: are such mixtures supported by Distributions.jl? If so, how do I define them?

Thank you!

skleinbo · April 13, 2022, 2:40pm

A mixture of a discrete (Dirac) and a continuous (Normal) distribution is not defined. Suppose it were: what would pdf(dist, x) return?

One might approximate the “continuous” Dirac distribution \int \delta(x) dx = 1 by a very narrow Gaussian, but that will probably cause trouble if one needs to evaluate integrals.
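As a rough sketch of the narrow-Gaussian idea, the snippet below replaces the Dirac at x_0 with a Normal of very small standard deviation. The weight `w`, the location `x0`, and the width `eps_sd` are illustrative assumptions, not values from this thread.

```julia
using Distributions

w, x0 = 0.3, 0.0     # assumed atom weight and location
eps_sd = 1e-8        # assumed width of the narrow stand-in for the Dirac

# Both components are continuous, so MixtureModel accepts them.
d = MixtureModel([Normal(x0, eps_sd), Normal(0.0, 1.0)], [w, 1 - w])

# The cdf rises by roughly w across a tiny window around x0.
jump = cdf(d, x0 + 1e-6) - cdf(d, x0 - 1e-6)

# The density at x0 scales like w / eps_sd, which is what hurts numerical integration.
peak = pdf(d, x0)
```

The cdf behaves almost like the true mixture, while the pdf value at x0 depends entirely on the arbitrary choice of `eps_sd`.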

Vincent_Picaud · April 13, 2022, 3:00pm

Thanks for the answer. Yes, I agree that this distribution has no density function. However, it is still a probability measure: a weighted sum of a Dirac measure and the measure defined by the Normal pdf with respect to the Lebesgue measure.

Thank you for your suggestion, that’s a good idea. However, I cannot use it in my context. Here is what I wanted to do:

I have a sample vector of type Vector{Union{Missing,Float64}}, which I wanted to model by:

\pi \delta_{\text{missing}} + (1-\pi)\mathcal{N}(\mu,\sigma)

where \pi is the proportion of missing values and \mu, \sigma are estimated from the available (non-missing) samples.
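A minimal sketch of that estimation step, using only the Statistics standard library. The sample `x` is made up for illustration, and the variable names are hypothetical.

```julia
using Statistics

# Made-up sample with missing entries.
x = Union{Missing,Float64}[1.2, missing, 0.7, missing, 2.1, 1.5]

p_missing = count(ismissing, x) / length(x)   # estimate of π
obs = collect(skipmissing(x))                 # observed values only
mu_hat = mean(obs)                            # estimate of μ
sigma_hat = std(obs)                          # estimate of σ (n - 1 denominator)
```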

I agree that this example is a little weird, as the law does not have a pdf. That is certainly the reason why it is not allowed. Again, thanks for your answer highlighting this fact.

skleinbo · April 13, 2022, 3:27pm

Don’t mark my comment as a solution yet. Maybe someone else has an idea on how to model this.

I agree that it’s a measure, but you’d need information on which mixture component was sampled to make sense of pdf etc.

There is also MeasureTheory.jl, but I haven’t used it.

Vincent_Picaud · April 13, 2022, 3:54pm

Thanks for the suggestion. But my example is still weird. The MeasureTheory.jl package seems to only mention Lebesgue measure.
I will write the few lines of Julia code I need for my task without relying on an extra package, using

p(A) = \pi\delta_{x_0}(A)+(1-\pi)\int_A f(x)\lambda(dx)

The key observation was the lack of a pdf, and you caught it, so I will accept your answer as the solution.
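One way the formula for p(A) might be sketched for intervals A = [a, b] with a real-valued atom x_0 is shown below. The type `AtomMixture` and the functions `prob` and `draw` are hypothetical names, not part of Distributions.jl; only `cdf`, `rand`, and `Normal` come from the package.

```julia
using Distributions

# Hypothetical type: an atom of mass `w` at `x0` plus a continuous part `cont`.
struct AtomMixture{D<:ContinuousUnivariateDistribution}
    w::Float64
    x0::Float64
    cont::D
end

# p([a, b]) = w * δ_{x0}([a, b]) + (1 - w) * ∫_a^b f(x) dx
function prob(m::AtomMixture, a::Real, b::Real)
    atom = (a <= m.x0 <= b) ? m.w : 0.0
    return atom + (1 - m.w) * (cdf(m.cont, b) - cdf(m.cont, a))
end

# Draw one value: the atom with probability w, otherwise the continuous part.
draw(m::AtomMixture) = rand() < m.w ? m.x0 : rand(m.cont)

m = AtomMixture(0.3, 0.0, Normal(0.0, 1.0))
total = prob(m, -Inf, Inf)       # mass of the whole real line
near_atom = prob(m, -1e-9, 1e-9) # dominated by the atom's weight
```

For the missing-value use case, the atom would sit outside the real line, so `draw` would return `missing` instead of `m.x0` and the interval probability would keep only the continuous term.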

sethaxen · April 13, 2022, 4:51pm

Every density function is defined wrt a base measure, so it’s not enough to say whether the density function exists or not. We must define the base measure. The reason this fails in Distributions is that every distribution has an implicit base measure (absolutely continuous wrt the Lebesgue measure for Normal and absolutely continuous wrt the counting measure for Dirac), and MixtureModel requires these be the same measure.

But we can also define a mixture measure that is a mixture of different measures and use that as a base measure for a mixture model, which allows for mixing continuous and discrete components. This is the implicit base measure of Censored. See also https://twitter.com/sethaxen/status/1488314028803997698?t=YdU1ZRQXaiCROopv8l53eg&s=19.
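To make this concrete for the mixture in the question, here is a short worked example, assuming the base measure is taken to be \nu = \delta_{x_0} + \lambda, with \lambda the Lebesgue measure and f the Normal pdf. Since \lambda(\{x_0\}) = 0 and \nu(\{x_0\}) = 1, the density of p = \pi\delta_{x_0} + (1-\pi)\mathcal{N}(\mu,\sigma) with respect to \nu is

\frac{dp}{d\nu}(x) = \begin{cases} \pi & \text{if } x = x_0 \\ (1-\pi) f(x) & \text{if } x \neq x_0 \end{cases}

so that p(A) = \int_A \frac{dp}{d\nu}(x)\,\nu(dx) = \pi\delta_{x_0}(A) + (1-\pi)\int_A f(x)\lambda(dx).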

So the best you can probably do now is use a narrow Normal but this will have problems. There are some Distributions issues tracking allowing MixtureModel to mix measures and in general supporting Distributions with atoms, and I can share some of those later when I’m at a computer.

Vincent_Picaud · April 13, 2022, 5:04pm

Thank you @sethaxen for this clarification, that is very clear and interesting. I will read your pointers.

Unfortunately, I do not think I can use the narrow Normal trick, as my samples are in Union{Missing, Float64} and I use \delta_\text{missing} to account for missing values. To use the narrow Normal trick I would have to define something like [\text{missing}-\epsilon,\text{missing}+\epsilon], which I can’t.
