[RFC] Taking directional/orientational statistics seriously

To my knowledge, there are no Julia packages with good support for directional and orientational statistics. Directional statistics is used in biology (especially structural biology), crystallography, astronomy, and various other physics applications including geophysics.

Distributions.jl implements VonMises for circular variables (angles) and VonMisesFisher for spherical variables (unit vectors), but that’s it. No orientational distributions are implemented.

From the discussion on Slack, it seems like a package extending Distributions.jl and Statistics.jl to handle these kinds of distributions could be useful, and I’d like to 1) gauge interest in such a package, 2) present a rough design proposal for feedback, and 3) find potential contributors since I only need a subset of these features for my research.

Goals:

  • Implement common distributions/statistics in directional/orientational statistics
  • Compatible with common AD frameworks
  • Efficient fitting (where possible)
  • Usable within PPLs like Soss, Turing, and Gen
  • Realize directional statistics as a special case of statistics on Riemannian manifolds by using Manifolds.jl’s manifolds and interface
  • Stress test for Manifolds.jl distributions interface (mostly written by @mateuszbaran; see also Simplifying Distributions type hierarchy)

Covered Manifolds

These are the manifolds we will consider for defining distributions. Some are already implemented in Manifolds.jl.

Directional

  • Circle
  • Manifolds.Sphere
  • Hemisphere? (for points that represent axes)
  • Torus

Orientational

  • Manifolds.Stiefel (WIP by @kellertuer)
  • Manifolds.Rotations

Distributions

Some of these are sufficiently general that they might make their way into Manifolds.jl or a package like ManifoldsDistributions.jl.

Generic (have a Manifold type for specialization)

  • Dirac
  • Uniform
  • Normal analogs
    • ProjectedTangentNormal (normal in tangent space projected to manifold)
    • RiemannianNormal (normal in manifold)
    • IsotropicDiffusion (solution to heat equation)
  • Mixture
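
To make the ProjectedTangentNormal idea concrete, here is a minimal sketch on the unit sphere using only Julia's standard libraries. The function name `rand_projected_tangent_normal` is hypothetical, and so is the choice of normalization as the projection step (the exponential map would be another reasonable reading of "projected to manifold"). It assumes `μ` is already a unit vector and does not use the Manifolds.jl interface.

```julia
using LinearAlgebra, Random

# Hypothetical sampler: draw an isotropic Gaussian in the tangent space at μ,
# then project the displaced point back onto the unit sphere by normalizing.
# Assumes norm(μ) == 1.
function rand_projected_tangent_normal(rng::AbstractRNG, μ::AbstractVector, σ::Real)
    z = σ .* randn(rng, length(μ))   # isotropic Gaussian in the ambient space
    v = z .- dot(μ, z) .* μ          # drop the radial part: v is tangent at μ
    return normalize(μ .+ v)         # projection onto the sphere
end

rng = MersenneTwister(0)
μ = [0.0, 0.0, 1.0]
xs = [rand_projected_tangent_normal(rng, μ, 0.3) for _ in 1:5]
```

With this projection every sample stays in the open hemisphere around `μ`, since `dot(μ, μ + v) == 1` before normalizing. That may or may not be what one wants for large `σ`.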

Directional

  • Circle
    • Wrapped{<:ContinuousUnivariateDistribution}
    • VonMises
  • Sphere
    • VonMisesFisher
    • Kent
  • Hemisphere
    • Watson
  • Torus
    • CircleProductDistribution (product of N Circles)
      • e.g. NvariateVonMises
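
As an illustration of what a `Wrapped{Normal}` density would compute, here is a rough sketch in plain Julia. `Wrapped` is only a proposed type, so the helper names `normal_pdf` and `wrapped_normal_pdf` and the truncation parameter `K` are assumptions made for this example. The wrapped density is the sum of the line density over all shifts by 2πk, truncated here to |k| ≤ K.

```julia
# Density of a normal distribution with mean μ and standard deviation σ.
normal_pdf(x, μ, σ) = exp(-((x - μ) / σ)^2 / 2) / (σ * sqrt(2π))

# Wrapped normal density on the circle: sum the line density over the
# shifts θ + 2πk, truncated to |k| ≤ K (K is an illustrative choice).
function wrapped_normal_pdf(θ, μ, σ; K::Integer = 10)
    return sum(normal_pdf(θ + 2π * k, μ, σ) for k in -K:K)
end

# Sanity check: the density should integrate to about 1 over one period.
N = 1000
θgrid = (0:N-1) .* (2π / N)
total = sum(wrapped_normal_pdf(θ, 0.5, 0.8) for θ in θgrid) * (2π / N)
```

For small `σ` only a few terms matter, while for large `σ` a Fourier-series form converges faster; a real implementation would probably switch between the two.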

Orientational

  • Stiefel
    • TupleDistribution{Stiefel}? (SphereProductDistribution restricted to orthonormal)
      • e.g. MatrixVonMisesFisher
  • Rotations
    • Compose from Stiefel/Sphere

Statistics

  • mean resultant length
  • others…
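
For reference, the mean resultant length of a sample of angles can be sketched in a few lines with the Statistics standard library. The function names `mean_resultant_length` and `circular_mean` are placeholders for this example, not an existing API.

```julia
using Statistics

# Mean resultant length of angles in radians: the norm of the average unit
# vector. Values near 1 mean tight concentration, values near 0 mean the
# directions largely cancel.
function mean_resultant_length(θ::AbstractVector{<:Real})
    C = mean(cos, θ)
    S = mean(sin, θ)
    return hypot(C, S)
end

# Circular mean direction, returned in [-π, π].
circular_mean(θ::AbstractVector{<:Real}) = atan(mean(sin, θ), mean(cos, θ))

angles = [0.1, 0.2, -0.1, 0.05, 6.2]   # 6.2 rad is just short of a full turn
R = mean_resultant_length(angles)
m = circular_mean(angles)
```

Note that the arithmetic mean of `angles` would be pulled toward 1.3 by the 6.2 entry, while the circular mean stays close to 0.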

Sounds like a great idea for a package!

Would love to learn more about these statistics in a package. :100:

I don’t know if you have any geospatial application in mind, but spherical coordinates are something on my TODO list. GeoStats.jl contains some directional statistics, but I think the definition there is different: for example, directional variograms and covariances.

This sounds great! @mateuszbaran I’d love to hear more about your distribution interface for Manifolds.jl

If you do start a specific package then I’ll point my much smarter colleagues towards it. I have only implemented some very rudimentary methods in the past so I can’t personally code much of the math. However, there’s a huge use for these in my field that I’d be excited to explore at some point.

I don’t know much about geospatial statistics, but I imagine there’s some overlap and room for collaboration. Ideally, we’d support multiple parameterizations of the distributions (Manifolds certainly allows for this), so spherical coordinates would be supported at some point. The proposed package would be general and as light-weight as possible, so it could be a useful backend.

+1 for general and light weight. There are a lot of applications and a clean/easy user interface like we have in Distributions.jl could open up Julia to a lot of very important research.

I agree. There are limitations though. Distributions on my machine takes ~8s to load. Manifolds adds another 5s. They’re not heavyweight per se, but that’s still annoying especially as a backend for a package that might have many other dependencies and may only need one or two directional features.

I agree that can be an issue but that’s a problem that extends across many packages and I think you’ll drive yourself crazy trying to solve it while creating a new project. If you focus on maintainability, extensibility, and efficient algorithms then you’ll likely be good in the long run.

Thanks for this summary and RFC. I would like to have such features (in Manifolds.jl or a separate package) in order to have general random generators (from the distribution). I did this as a first approach with randomMPoint, for example, and I would like to have that on arbitrary manifolds (though I am not a statistician), since one thing I want to consider is noisy measurements in manifold-valued optimization. On Stiefel this would be a von Mises-Fisher, yes.

Yes, off the top of my head, there are 3 main properties you want for a distribution on a manifold: 1) ability to efficiently compute the log density (at least up to a normalizing constant), 2) ability to draw exact samples, and 3) ability to invert that problem by fitting its parameters from random samples.

On manifolds, few distributions have all 3 of these properties worked out. von Mises-Fisher is an exception, which is why it’s so widely used. The normal analogs listed above are quite general and should ultimately be in Manifolds. Each of them satisfies a subset of these requirements, though for directional statistics we sometimes get lucky (for circles IIRC they all converge to the Wrapped{Normal}). In practice we want many generic distributions, and as properties like log densities are worked out for these distributions on specific manifolds, users can overload the corresponding methods.
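
To show what the 3 properties look like in the lucky case, here is a rough sketch for von Mises-Fisher on the 2-sphere (unit vectors in ℝ³), where everything has a closed form. The names `vmf_logpdf`, `vmf_rand`, and `vmf_fit` are made up for this example and are not the Distributions.jl API. The sampler uses the inverse CDF of the cosine w = μ⋅x, which only works this simply in 3 dimensions, and the concentration estimate is the common approximation κ ≈ R̄(3 − R̄²)/(1 − R̄²) rather than the exact MLE.

```julia
using LinearAlgebra, Random, Statistics

# (1) Log density on the 2-sphere, normalization included:
#     log(κ / (4π sinh κ)) + κ μ⋅x, rearranged to avoid overflow for large κ.
function vmf_logpdf(x, μ, κ)
    return κ * (dot(μ, x) - 1) + log(κ) - log(2π) - log1p(-exp(-2κ))
end

# (2) Exact sampling: invert the CDF of w = μ⋅x, then pick a uniformly
#     random direction in the plane orthogonal to μ.
function vmf_rand(rng::AbstractRNG, μ, κ)
    u = rand(rng)
    w = 1 + log(u + (1 - u) * exp(-2κ)) / κ
    z = randn(rng, 3)
    v = normalize(z - dot(μ, z) * μ)
    return w * μ + sqrt(max(0.0, 1 - w^2)) * v
end

# (3) Fitting from samples stored as the columns of a 3×N matrix.
#     The κ estimate is an approximation, not the exact MLE.
function vmf_fit(X::AbstractMatrix)
    r = vec(mean(X; dims = 2))
    Rbar = norm(r)                       # mean resultant length
    return (μ = r / Rbar, κ = Rbar * (3 - Rbar^2) / (1 - Rbar^2))
end

rng = MersenneTwister(42)
μ_true = [0.0, 0.0, 1.0]
X = reduce(hcat, [vmf_rand(rng, μ_true, 10.0) for _ in 1:20_000])
est = vmf_fit(X)
```

The point of the sketch is that fitting reduces to the sample mean vector, which is a sufficient statistic here. Most of the other distributions in the list have no such shortcut.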

Love it! I hope I can help out.

After discussing this on Slack with @kellertuer, we realized most of these cases can be realized as special cases of a few generic distributions we can define on arbitrary manifolds. This motivated this issue to get these implemented directly in Manifolds.jl: https://github.com/JuliaNLSolvers/Manifolds.jl/issues/57. It might still be worth it to have a special package that wraps these generic distributions for easier use and includes some special functions for e.g. MLE.

Thanks @sethaxen for the RFC. I won’t be able to help much in the near future but I’ll at least be here to discuss things. I have plans to work in this area and it’s great to see many people interested in manifolds and statistics :+1:.

Not sure if it’s related or makes sense, but BioJulia has a structures package - it would be nice if it were possible to interact with that.

I’d be curious if or how some basic stochastic processes like spherical and circular Brownian motion fit into the picture (https://github.com/mschauer/Bridge.jl/)

Because BioStructures and Manifolds are both designed to be base packages, I don’t see them directly interacting, but of course any package could mix their functionality. On a personal note, I’m a structural biologist, and my motivating applications to manifold statistics are for inferring macromolecular conformational ensembles; for these applications I plan to use both packages, though I’ve yet to work out the details.

By the way, thanks @jgreener64 for your great work on BioStructures.jl

This is something I’m keen to support. FWIW, my research uses Brownian motion distributions on the sphere, rotation group, and rigid body motion group, though they’re not the focus.

Above I mentioned IsotropicDiffusion in general and Wrapped specifically for the circle, with which I intend to cover “isotropic Brownian motion” (solutions to the heat equation), potentially with shift and drift (Brownian motion on the circle produces the wrapped normal).

But this is all vague and hand-wavy right now because I don’t have a firm grasp on the theory or terminology, and I’m sure there are many senses of “Brownian” motion that I’m not accounting for. Thanks for the link to the package; I plan to study the documentation. Any additional resources you can provide to orient me would be useful.
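
For the circle case at least, the wrapped normal claim is easy to check numerically. This is only a sketch under my own assumptions: the function name `circle_bm_endpoint`, the Euler step count, and the parameter values are arbitrary. It simulates driftless Brownian motion by wrapping each step to [0, 2π) and compares the mean resultant length of the endpoints with the wrapped normal value exp(−σ²t/2).

```julia
using Random, Statistics

# Endpoint of Brownian motion on the circle after time t, started at θ0,
# simulated with nsteps Gaussian increments wrapped to [0, 2π).
function circle_bm_endpoint(rng::AbstractRNG, θ0, σ, t; nsteps::Integer = 100)
    dt = t / nsteps
    θ = float(θ0)
    for _ in 1:nsteps
        θ = mod2pi(θ + σ * sqrt(dt) * randn(rng))
    end
    return θ
end

rng = MersenneTwister(1)
σ, t = 1.0, 0.5
θs = [circle_bm_endpoint(rng, 0.0, σ, t) for _ in 1:10_000]

Rbar = hypot(mean(cos, θs), mean(sin, θs))   # empirical mean resultant length
Rtheory = exp(-σ^2 * t / 2)                  # wrapped normal with variance σ²t
```

The two numbers should agree up to Monte Carlo error. Wrapping at every step or only once at the end gives the same distribution, since wrapping commutes with adding increments.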

Likewise I don’t see any immediate ways to link these two packages, but I will keep an eye out in the future for sure. Being Julia, no doubt someone will come along and do something cool with them. Glad you are enjoying BioStructures.jl!

Just to bring the CircStat package into the picture: it is quite popular in neuroscience. I personally translated several of its functions to Julia for my own limited usage, and would definitely help others who want directional statistics in Julia.