For density estimation, I know that there are various kernel density estimation packages (KernelDensity.jl, KernelDensityEstimate.jl, KernelEstimator.jl), but I wonder if there are any spline-based packages? The goal here is to find a package that can:
- produce decent density estimates that are well-behaved at the boundaries of the support
- run quickly on thousands to tens of thousands of samples.
For example, there is a nice logspline package for R whose logspline function is fast and well-behaved at boundaries.
Does anyone know of any such spline-based density estimation packages? Or have any other suggestions?
If anyone is interested in implementing it, then this is the place to start:
Kooperberg, C., & Stone, C. J. (1992). Logspline density estimation for censored data. Journal of Computational and Graphical Statistics, 1(4), 301-328.
But the current implementation seems to be based on this later paper:
Stone, C. J., Hansen, M. H., Kooperberg, C., & Truong, Y. K. (1997). Polynomial splines and their tensor products in extended linear modeling: 1994 Wald memorial lecture. Annals of Statistics, 25(4), 1371-1470.
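For anyone skimming, here is a rough sketch of the basic logspline model as I understand it from those papers. The notation is my own assumption, not copied from either paper: B_1, ..., B_p are spline basis functions on a chosen set of knots, theta is the coefficient vector, and [L, U] is the (possibly infinite) support.

```latex
% Log-density modelled as a spline; C(theta) normalises it to integrate to 1.
f(x;\theta) = \exp\!\Big( \sum_{j=1}^{p} \theta_j B_j(x) - C(\theta) \Big),
\qquad
C(\theta) = \log \int_{L}^{U} \exp\!\Big( \sum_{j=1}^{p} \theta_j B_j(x) \Big)\, dx

% Log-likelihood for samples x_1, ..., x_n, maximised over theta.
\ell(\theta) = \sum_{i=1}^{n} \sum_{j=1}^{p} \theta_j B_j(x_i) - n\, C(\theta)
```

Since the log-density is linear in theta, this is an exponential family and the log-likelihood is concave, so for a fixed set of knots the fit is a Newton-type optimisation. The harder parts of a port would presumably be the numerical integration for C(theta) and the stepwise knot selection described in the papers.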
I had a look at the logspline R package code to see whether porting it would be difficult. It looks like the R functions are just wrappers that call C code, so, to be honest, there is zero chance I could port it myself. The mathematical and algorithmic details in the papers are also beyond me.
Maybe someone here would be up for the challenge of porting it to Julia?