Distributions.jl: is it possible to define a Normal+Dirac mixture distribution?

I want to manipulate a distribution mixture of the form

\pi\delta_{x_0}+(1-\pi)\mathcal{N}(\mu,\sigma)

I have checked that I can create a mixture of Normal & Cauchy, for example:

julia> using Distributions

julia> MixtureModel([Normal(),Cauchy()])
MixtureModel{Distribution{Univariate, Continuous}}(K = 2)
components[1] (prior = 0.5000): Normal{Float64}(μ=0.0, σ=1.0)
components[2] (prior = 0.5000): Cauchy{Float64}(μ=0.0, σ=1.0)

However, when I try to define a Normal + Dirac mixture, I get an error:

julia> MixtureModel([Normal(),Dirac(0.0)])
ERROR: MethodError: no method matching value_support(::Type{UnivariateDistribution})
Closest candidates are:
  value_support(::Type{<:Distribution{VF, VS}}) where {VF, VS} at ~/.julia/packages/Distributions/O4ZJg/src/common.jl:147
  value_support(::Type{<:Sampleable{<:VariateForm, VS}}) where VS at ~/.julia/packages/Distributions/O4ZJg/src/common.jl:47
Stacktrace:
 [1] MixtureModel(components::Vector{UnivariateDistribution}, prior::Categorical{Float64, Vector{Float64}})
   @ Distributions ~/.julia/packages/Distributions/O4ZJg/src/mixtures/mixturemodel.jl:134
 [2] MixtureModel(components::Vector{UnivariateDistribution})
   @ Distributions ~/.julia/packages/Distributions/O4ZJg/src/mixtures/mixturemodel.jl:117
 [3] top-level scope
   @ REPL[16]:1

My question: are such mixtures supported by Distributions.jl? If so, how do I define them?

Thank you!

A mixture of a discrete (Dirac) and continuous (Normal) distribution is not defined. Assume one could. What would pdf(dist, x) return?

One might approximate the “continuous” Dirac distribution \int \delta(x) dx = 1 by a very narrow Gaussian, but that will probably cause trouble if one needs to evaluate integrals.
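A minimal sketch of that narrow-Gaussian workaround, with made-up parameters (the weight `w`, location `x0`, and width `ε` are assumptions for illustration only). It shows that the mixture is accepted because both components are continuous, and that the density near `x0` becomes very large:

```julia
using Distributions

# Hypothetical parameters chosen only for illustration.
w, x0, ε = 0.3, 0.0, 1e-6     # atom weight, atom location, width of the stand-in Normal
μ, σ = 0.0, 1.0               # parameters of the continuous component

# Both components are continuous, so MixtureModel accepts them.
approx = MixtureModel([Normal(x0, ε), Normal(μ, σ)], [w, 1 - w])

# Nearly all of the weight w sits in a tiny interval around x0 ...
mass_near_x0 = cdf(approx, x0 + 1e-3) - cdf(approx, x0 - 1e-3)

# ... but the density at x0 is huge, which is what makes numerical integration fragile.
peak = pdf(approx, x0)
```

A quadrature rule that does not place nodes very close to `x0` would miss most of that mass.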

Thanks for the answer. Yes, I agree this distribution has no density function. However, it is still a probability measure: a weighted sum of a Dirac measure and the measure defined by the Normal pdf with respect to the Lebesgue measure.

Thank you for your suggestion, that’s a good idea. However, I cannot use it in my context. What I wanted to do:

I have a sample vector of type Vector{Union{Missing,Float64}} that I wanted to model by:

\pi \delta_{\text{missing}} + (1-\pi)\mathcal{N}(\mu,\sigma)

where \pi is the proportion of missing values and \mu, \sigma are estimated from the available samples.
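As a rough sketch of what this amounts to without any mixture type, the three parameters can be estimated directly. The sample `x` and the names `p_missing`, `mu_hat`, `sigma_hat`, and `draw` are hypothetical, introduced only for this example:

```julia
using Statistics

# Hypothetical sample; any Vector{Union{Missing,Float64}} works the same way.
x = Union{Missing,Float64}[1.0, missing, 2.0, 3.0, missing, 2.0]

p_missing = count(ismissing, x) / length(x)   # estimate of \pi
observed  = collect(skipmissing(x))           # available values as a Vector{Float64}
mu_hat    = mean(observed)                    # estimate of \mu
sigma_hat = std(observed)                     # sample standard deviation (n - 1 denominator)

# Draw one value from the fitted law: missing with probability p_missing,
# otherwise a Normal(mu_hat, sigma_hat) variate.
draw() = rand() < p_missing ? missing : mu_hat + sigma_hat * randn()
```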

I do agree that this example is a little weird, as the law does not have a pdf. That is certainly the reason why this is not allowed. Again, thanks for your answer pointing this out.

Don’t mark my comment as a solution yet. Maybe someone else has an idea on how to model this.

I agree that it’s a measure, but you’d need information on which mixture component was sampled to make sense of pdf etc.

There is also MeasureTheory.jl, but I haven’t used it.

Thanks for the suggestion. But my example is still weird. The MeasureTheory.jl package seems to only mention Lebesgue measure.
I will write the few lines of Julia code I need for my task without relying on an extra package, using

p(A) = \pi\delta_{x_0}(A)+(1-\pi)\int_A f(x)\lambda(dx)
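For the case where A is a closed interval [a, b] and x_0 is a real number, this formula could be coded roughly as follows. The helper `mixture_prob` and its keyword names are hypothetical; the weight is called `w` rather than `π` to avoid shadowing Julia's built-in constant:

```julia
using Distributions

# Hypothetical helper: probability of the closed interval [a, b] under
# w * δ_{x0} + (1 - w) * Normal(μ, σ).
function mixture_prob(a, b; w, x0, μ, σ)
    atom = (a <= x0 <= b) ? w : zero(w)          # Dirac part: all of w or nothing
    d = Normal(μ, σ)
    cont = (1 - w) * (cdf(d, b) - cdf(d, a))     # integral of the Normal density over [a, b]
    return atom + cont
end

p_interval = mixture_prob(-1.0, 1.0; w = 0.3, x0 = 0.0, μ = 0.0, σ = 1.0)
```

A degenerate interval such as `mixture_prob(0.0, 0.0; ...)` returns just the atom weight, which is the behaviour a pdf with respect to the Lebesgue measure alone cannot express.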

The key observation was the lack of a pdf, and you caught it, so I will accept it as the solution.

Every density function is defined wrt a base measure, so it’s not enough to say whether the density function exists or not. We must define the base measure. The reason this fails in Distributions is that every distribution has an implicit base measure (the Lebesgue measure for Normal and the counting measure for Dirac, with each distribution absolutely continuous wrt its own base measure), and MixtureModel requires these be the same measure.
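A small check, assuming a Distributions version that ships `Dirac`, makes the mismatch visible: the two components belong to different value-support families, and `pdf` means a different thing for each.

```julia
using Distributions

# Normal is a continuous distribution: pdf is a density wrt the Lebesgue measure.
normal_is_continuous = Normal() isa ContinuousUnivariateDistribution

# Dirac is a discrete distribution: pdf is a probability mass wrt the counting measure.
dirac_is_discrete = Dirac(0.0) isa DiscreteUnivariateDistribution

density_value = pdf(Normal(), 0.0)     # a Lebesgue density value, 1 / sqrt(2π)
mass_value    = pdf(Dirac(0.0), 0.0)   # a probability mass
```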

But we can also define a mixture measure that is a mixture of different measures and use that as a base measure for a mixture model, which allows for mixing continuous and discrete components. This is the implicit base measure of Censored. See also https://twitter.com/sethaxen/status/1488314028803997698?t=YdU1ZRQXaiCROopv8l53eg&s=19.
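As an illustration of the `Censored` case, here is a sketch that assumes a Distributions version providing the `censored` constructor with a `lower` keyword. To my understanding, `pdf` at the censoring bound returns the point mass there rather than a Lebesgue density, which is what a mixed base measure implies:

```julia
using Distributions

# A standard Normal censored below at 0: all mass from (-Inf, 0] collapses onto the point 0.
d = censored(Normal(), lower = 0.0)

at_bound = pdf(d, 0.0)   # probability mass of the atom at the bound
interior = pdf(d, 1.0)   # ordinary Normal density in the uncensored region
```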

So the best you can probably do now is use a narrow Normal but this will have problems. There are some Distributions issues tracking allowing MixtureModel to mix measures and in general supporting Distributions with atoms, and I can share some of those later when I’m at a computer.

Thank you @sethaxen for this clarification, that is very clear and interesting. I will read your pointers.

Unfortunately, I do not think I can use the narrow Normal trick, as my samples are in Union{Missing, Float64} and I use \delta_\text{missing} to take missing values into account. To use the narrow Normal trick I would have to define something like [\text{missing}-\epsilon,\text{missing}+\epsilon], which I can’t do.