[ANN] StableDistributions.jl: stable distributions in Julia

I am happy to announce that Julia now has a feature-complete package for stable distributions (see Wikipedia): StableDistributions.jl. This class of distributions is crucially important when dealing with data exhibiting heavy, power-law tails. These tools were already available in SciPy, MATLAB and R; now they are also available in Julia.

This package fully complies with the Distributions.jl interface (a short usage sketch follows the list):

  • distribution types satisfy `Stable <: Distribution`,
  • for generation there is `rand`,
  • you can fit data with `fit`, which uses the empirical characteristic function (ECF) regression technique; this is considered the best contemporary method next to maximum likelihood estimation (which is much, much more computationally demanding),
  • for moments there are `mean`, `var`, etc. (these are often infinite for this class),
  • probability density `pdf`, and similarly `cdf`, `mgf`, `cf`, `quantile`, etc.,
  • affine transformations of stable distributions are available: you can multiply them by constants and add constants,
  • two stable distributions with the same stability index can be convolved with `convolve`.
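
A minimal usage sketch of the calls listed above (the parameter values are arbitrary examples of mine):

using Distributions, StableDistributions

# A stable distribution with stability α = 1.5, skewness β = 0.5,
# scale σ = 1 and location μ = 0:
d = Stable(1.5, 0.5, 1.0, 0.0)

x = rand(d, 10_000)       # generation
d̂ = fit(Stable, x)        # ECF regression fit
pdf(d, 0.3), cdf(d, 0.3)  # numerical density and distribution function
cf(d, 1.0)                # characteristic function
mean(d)                   # finite here, since α > 1
d2 = 2 * d + 1            # affine transformation: 2X + 1 is again stable
d3 = convolve(d, d2)      # sum of independent stables with the same α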

This started as a PR to add these features to Distributions.jl, but there was no response; it seems the maintainers are too busy. Merging it in the future would be a good idea, but the package would still have a raison d’être, as there are some advanced features which should be tried out elsewhere before possibly moving them to Distributions.

In the future, adding maximum likelihood estimation and some classes of multivariate stable distributions (mainly elliptic and discrete-measure ones) would be useful.

46 Likes

Many thanks for the effort; it is much easier to use the new package than the PR.

Bravo again for the good work.

1 Like

Great stuff! How does it compare with GitHub - org-arl/AlphaStableDistributions.jl: Alpha stable and sub-Gaussian distributions in Julia?

I see it has fewer dependencies, which is good :slight_smile:

AlphaStableDistributions.jl is a very small package which ports some functionality from arlpy. It has just two methods: `rand` and `fit`. Moreover, their `fit` uses the McCulloch quantile method, which nowadays should not be used as a primary estimation tool.

This package is much more extensive and was written from scratch using modernised algorithms, mostly from John P. Nolan’s 2020 book.

What is currently not available here but is available there are multivariate elliptic stable distributions (they call them “sub-Gaussian”; this name appears in many places, but it is confusing). I know how to implement at least basic elliptic stable functionality; I will add it in a few months.
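
For reference, the classical construction behind the “sub-Gaussian” name (following Samorodnitsky and Taqqu; parametrisation conventions vary, so treat the scale constant as indicative): take a totally skewed positive stable random scale and an independent Gaussian vector,

A \sim S_{\alpha/2}\!\left(\left(\cos\tfrac{\pi\alpha}{4}\right)^{2/\alpha},\, 1,\, 0\right), \qquad G \sim \mathcal{N}(0, \Sigma), \qquad X = \sqrt{A}\, G,

and X is then an elliptic α-stable vector.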

5 Likes

Thanks for the review! Since I did not like their dependencies too much, I actually moved a bit of their code (the only bit that I use) inside my package there. The only call to it is at the bottom of this file.

Would you help me translate this to use your package? I think I simply need to remove this file and change the call to the distribution to match your definitions of the parameters, but I’d like to have your opinion.

Both libraries use the Distributions.jl interface, so this is very similar. If I understand it correctly, at that line you create a stable distribution variable, and somewhere deeper in the code the corresponding `rand` is called. Check if just switching the `AlphaStable` call to

Stable(1/G.θ, 1, cos(π/(2G.θ))^G.θ, (G.θ == 1 ? 1 : 0))

works.
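
For context, a runnable sketch of that swap with a hypothetical stand-in for the G object (only its θ field is assumed):

using StableDistributions

# Hypothetical container mirroring the G referenced above.
struct SubGaussianParams
    θ::Float64
end
G = SubGaussianParams(2.0)

# Totally skewed (β = 1) stable law with α = 1/θ, scale cos(π/(2θ))^θ,
# and the location suggested above.
d = Stable(1/G.θ, 1.0, cos(π/(2G.θ))^G.θ, G.θ == 1 ? 1.0 : 0.0)
rand(d)  # sampling then goes through the shared Distributions.jl interface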

@Tbl could you please point me to learning resources on stable distributions? I want to contribute to the package before the end of October. Thanks a lot.

I don’t know what you need. Elementary information is covered well on Wikipedia. Properties of univariate stable distributions are described in detail in Nolan’s book. For the rest, it depends.

Thanks for the package! I have a question about the possibility of constrained parameter fitting.

When I tried this package with real data, I found:

Fitted Parameters for path_length:

α (stability): 0.9092045908319368
β (skewness):  1.0
σ (scale):     0.05121054252961137
μ (location):  -0.34840504748351264

The location parameter is negative. However, I may wish to enforce that μ is nonnegative. Is it possible to perform a constrained fit, and how can I achieve that?

Just extract the likelihood and maximize it, but using the parameter r = \sqrt{\mu}?

This package uses a specialised characteristic function regression fit. For such a method, constrained fitting is just the unconstrained fit cropped to the desired interval; there is no way around this.
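
In other words, a minimal sketch of that cropping, assuming the desired constraint is μ ≥ 0 (the data line is just an illustrative stand-in):

using Distributions, StableDistributions

data = rand(Stable(0.9, 1.0, 0.05, -0.35), 5_000)  # illustrative sample

# Unconstrained ECF fit, then crop the location to the allowed interval.
d = fit(Stable, data)
d_constrained = Stable(d.α, d.β, d.σ, max(d.μ, 0.0))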

I was thinking about implementing maximum likelihood too; however, it is hard for stable variables because the pdfs are only available numerically, and even evaluating them is computationally demanding. And, from what I know of the literature, its effectiveness is comparable to the method used here.

Thanks for the explanation! I did try maximum likelihood estimation a bit, but for some reason I did not get a converged solution:

using Distributions
using Optim, ComponentArrays
using StableDistributions
using Printf

# Objective function for the constrained stable fit.
# It transforms unconstrained optimization variables to the valid parameter space.
function constrained_stable_negative_log_likelihood(params, data)
   # Transform parameters from unconstrained space to the constrained space
   # α is in (0, 2]
   α = 2.0 / (1.0 + exp(-params.trans_a))
   # β is in [-1, 1]
   β = tanh(params.trans_b)
   # σ > 0
   σ = exp(params.log_s)
   # μ >= 0
   μ = exp(params.log_m)

   dist_to_test = Stable(α, β, σ, μ)
   ll = sum(logpdf.(dist_to_test, data))
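   # (If any sample falls where the numerically evaluated pdf underflows to
   # zero, ll is -Inf and the guard below returns Inf.)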
   return isfinite(ll) ? -ll : Inf
end

# This function fits a constrained stable distribution using Optim.jl.
function fit_constrained_stable_distribution(data::Vector{<:Real})
   # Initial guesses for the *transformed* parameters.
   fitted_dist_init = fit(Stable, data)
   initial_params = ComponentArray(
      trans_a = -log(2/fitted_dist_init.α - 1),
      trans_b = atanh(fitted_dist_init.β),
      log_s = log(1.0),
      log_m = log(1.0)
   )
   objective_function = p -> constrained_stable_negative_log_likelihood(p, data)
   result = optimize(objective_function, initial_params, BFGS(),
      Optim.Options(g_tol = 1e-8, iterations = 1_000, show_trace = false))

   if !Optim.converged(result)
      @warn "Constrained Stable optimization failed to converge."
      println(result)
      return nothing
   end

   p = Optim.minimizer(result)
   return (
      α = 2.0 / (1.0 + exp(-p.trans_a)),
      β = tanh(p.trans_b),
      σ = exp(p.log_s),
      μ = exp(p.log_m)
   )
end

dist = Stable(1.5, 0, 2, 3)
data = rand(dist, 2000)

fitted_params = fit_constrained_stable_distribution(data)

println("Fitted Parameters for Scaled Data (Constrained Stable):")
@printf "α=%.4f, β=%.4f, σ=%.4f, μ=%.4f\n" fitted_params.α fitted_params.β fitted_params.σ fitted_params.μ
println("-"^40, "\n")
┌ Warning: Constrained Stable optimization failed to converge.
└ @ Main test_stable_distribution_MLE_fit.jl:39
 * Status: failure

 * Candidate solution
    Final objective value:     Inf

 * Found with
    Algorithm:     BFGS

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = NaN ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    0
    f(x) calls:    1
    ∇f(x) calls:   1

Maybe I used the wrong equations or the wrong method?

I went through the literature before, and I know this is a hard optimisation problem. People have made it work, but it requires specialised methods.

1 Like