 # How to write a custom distribution using pdf from a Kernel Density Estimator?

Let

`x = rand(100); #some data`

The PDF of `x` can be estimated using KDE:

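```julia
using KernelDensity

KDE_x = kde(x)           # fit the KDE to the data
ik = InterpKDE(KDE_x)    # interpolation object for evaluating the density
KDE_pdf(x) = pdf(ik, x)  # estimated density at x
```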

How can the `KDE_pdf` function be used to construct a custom distribution?

Here’s my partial first try; I still have to implement `rand` for this distribution. I want to use `KDEDist` in a Turing.jl model:

```julia
using Distributions, KernelDensity

struct KDEDist <: ContinuousUnivariateDistribution
    data::Vector{Float64}
end

function Distributions.pdf(d::KDEDist, x::Real)
    KDE_fit = kde(d.data)   # refitted on every call
    ik = InterpKDE(KDE_fit)
    return pdf(ik, x)
end
```

As we can see, the KDE is fitted every time the `pdf` function is called, which is wasteful. Is there a way to just pass the `pdf(ik, x)` function into the `Distributions.pdf` function to avoid the wasteful computation?

Do the KDE fit when constructing the struct.


Also consider doing the same for multivariate distributions using MultiKDE.jl (GitHub - noilreed/MultiKDE.jl: Multivariate kernel density estimation). Also, this would be a valuable package.

I don’t know how to do the KDE fit inside the struct.

Instead, I passed `KDE_pdf` into the struct like this:

```julia
using Distributions

struct KDEDist <: ContinuousUnivariateDistribution
    KDE_pdf::Function
    h::Float64  # used for rand
end

function Distributions.pdf(d::KDEDist, x::Real)
    return d.KDE_pdf(x)  # call the stored function, not the global KDE_pdf
end

dist = KDEDist(KDE_pdf, 1.0)
pdf(dist, 10)
```

Something like this:

```julia
using Distributions, KernelDensity

struct KDEDist{D, K} <: ContinuousUnivariateDistribution
    data::D
    kde::K
end

# Outer constructor: fit the KDE once, when the distribution is built
function KDEDist(data)
    KDE_fit = kde(data)
    ik = InterpKDE(KDE_fit)
    return KDEDist(data, ik)
end

function Distributions.pdf(d::KDEDist, x::Real)
    return pdf(d.kde, x)
end
```
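For the `rand` part that was left open, here is a rough sketch rather than a definitive implementation. It assumes the default Gaussian kernel of `kde`, in which case a draw from the estimated density can be approximated by picking a data point at random and adding `Normal(0, h)` noise, where `h` is the bandwidth. The name `KDEDistWithRand` is hypothetical, and `KernelDensity.default_bandwidth` is used on the assumption that it is the (unexported) helper `kde` uses by default. A `logpdf` method is included because Turing.jl works with log densities.

```julia
using Distributions, KernelDensity, Random

# Hypothetical variant of KDEDist that also stores the bandwidth for sampling
struct KDEDistWithRand{D<:AbstractVector{<:Real}, K} <: ContinuousUnivariateDistribution
    data::D
    kde::K
    h::Float64
end

function KDEDistWithRand(data::AbstractVector{<:Real})
    # Assumption: default_bandwidth is KernelDensity's default bandwidth rule
    h = KernelDensity.default_bandwidth(data)
    ik = InterpKDE(kde(data; bandwidth = h))  # fitted once, at construction
    return KDEDistWithRand(data, ik, h)
end

Distributions.pdf(d::KDEDistWithRand, x::Real) = pdf(d.kde, x)

# Clamp at zero so that small negative values from numerical error do not make log throw
Distributions.logpdf(d::KDEDistWithRand, x::Real) = log(max(pdf(d.kde, x), 0.0))

# Treat the support as unbounded, as for a Gaussian kernel mixture
Base.minimum(::KDEDistWithRand) = -Inf
Base.maximum(::KDEDistWithRand) = Inf

# Sample: choose a data point uniformly, then perturb it with Gaussian kernel noise
function Base.rand(rng::AbstractRNG, d::KDEDistWithRand)
    return rand(rng, d.data) + d.h * randn(rng)
end

dist = KDEDistWithRand(rand(100))
pdf(dist, 0.5)
rand(dist, 5)
```

Note that the interpolated density is truncated outside the KDE grid, so samples far in the tails can land where `pdf` returns zero; whether that matters depends on the model.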