Hello everyone,
I am currently trying to calculate probability density curves using the KernelDensity.jl package, but I ran into some issues due to the nature of my data.
I am not a statistician, so I will have to explain my problem in layman's terms…
The data vector x that I want to calculate the KDE of consists of non-negative integers (x_i ∈ ℤ, x_i ≥ 0), and this has to be reflected in the density plot.
I know that KernelDensity.kde uses the Normal distribution as its kernel by default, so the estimated density is not restricted to non-negative values.
using KernelDensity, Distributions, Plots
x = 0:1:1e4
KDE = kde(x, kernel = Normal)
plot(KDE.x, KDE.density)
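To show what I mean by density leaking below zero, independently of the package, here is a minimal hand-rolled Gaussian KDE. It is only a sketch and not how KernelDensity.jl is implemented: the names gauss and kde_at, the toy data xs, and the bandwidth h are all made up for illustration.

```julia
# Hand-rolled Gaussian KDE for illustration only (not KernelDensity.jl's implementation).

# Standard normal density, used as the kernel.
gauss(u) = exp(-u^2 / 2) / sqrt(2π)

# Density estimate at point t for data xs with bandwidth h.
kde_at(t, xs, h) = sum(gauss((t - xi) / h) for xi in xs) / (length(xs) * h)

xs = [0, 0, 1, 2, 3, 5, 8]   # made-up non-negative integer data
h = 1.0                      # arbitrary bandwidth, chosen by hand

# The estimate is strictly positive at t = -1 although no data lie below zero.
println(kde_at(-1.0, xs, h))
```

Since each Gaussian kernel extends over the whole real line, every observation near zero contributes some density at negative values.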
In this uniform x case I can simply add the boundary argument to the kde call to get rid of any density at x < 0.
KDE = kde(x, boundary=(0,1e4), kernel = Normal)
However, my real distribution of x has a long right tail,
x = round.(rand(LogNormal(3,1),1000))
KDE = kde(x, boundary = extrema(x), kernel = Normal)
and then the boundary argument just wraps the density curve around at zero.
I tried to adapt the underlying kernel_dist method with a truncated version of the Normal distribution, but that threw an error.
kernel_dist(::Type{Truncated},w::Real) = truncated(Normal(0.0,w);lower = 0)
KDE = kde(x, kernel = Truncated)
>ERROR: MethodError: no method matching Truncated(::Float64, ::Float64)
So my questions are: (1) what am I doing wrong when adding my own kernel distribution, and (2) is this even the correct approach to estimating the probability density?
I hope this question is reasonably clear, and thank you all in advance!