ComplexityMeasures.jl (Entropies.jl successor)
I’m incredibly proud to announce ComplexityMeasures.jl, which I believe is one of the most well-thought-out packages in JuliaDynamics, and perhaps in the whole of nonlinear dynamics. (wow, big statements!)
https://juliadynamics.github.io/ComplexityMeasures.jl/stable/
Intro
ComplexityMeasures.jl contains estimators for probabilities, entropies, and other complexity measures derived from observations, in the context of nonlinear dynamics and complex systems. It is the successor of the previous Entropies.jl package (which was never formally announced). We believe that ComplexityMeasures.jl is the “best” (most featureful, most extendable, most tested, fastest) open-source code base for computing entropies and/or complexity measures out there. We won’t offer concrete proof for this statement yet, but we are writing a paper on it; once we have a preprint, I will link it here.
Content
ComplexityMeasures.jl is a practical attempt at unifying the concepts of probabilities, entropies, and complexity measures. We (@kahaaga and @datseris) have spent several months designing a composable, modular, extendable interface that is capable of computing as many different variants of “entropy” or “complexity” as one can find in the literature.
The package first defines a generic interface for estimating probabilities from input data. Each probabilities estimator (a ProbabilitiesEstimator subtype) also defines an outcome space, and functions exist to compute the probabilities and their corresponding outcomes, as well as other convenience quantities like the size of the outcome space or the outcomes missing from the data. There is already a plethora of probabilities estimators:
| Estimator | Principle | Input data |
|---|---|---|
| CountOccurrences | Count of unique elements | Any |
| ValueHistogram | Binning (histogram) | Vector, StateSpaceSet |
| TransferOperator | Binning (transfer operator) | Vector, StateSpaceSet |
| NaiveKernel | Kernel density estimation | StateSpaceSet |
| SymbolicPermutation | Ordinal patterns | Vector, StateSpaceSet |
| SymbolicWeightedPermutation | Ordinal patterns | Vector, StateSpaceSet |
| SymbolicAmplitudeAwarePermutation | Ordinal patterns | Vector, StateSpaceSet |
| SpatialSymbolicPermutation | Ordinal patterns in space | Array |
| Dispersion | Dispersion patterns | Vector |
| SpatialDispersion | Dispersion patterns in space | Array |
| Diversity | Cosine similarity | Vector |
| WaveletOverlap | Wavelet transform | Vector |
| PowerSpectrum | Fourier transform | Vector |
An intermediate representation used by some probabilities estimators is the Encoding, which encodes elements of the input data into the positive integers. Encodings allow for a large amount of code reuse, as well as more possible output measures from the same code.
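For example, one can encode a short vector into the integer index of its ordinal pattern, and decode it back (a sketch; the encoding name and constructor are taken from the docs and may differ slightly across versions):

```julia
using ComplexityMeasures

# Encoding for ordinal (permutation) patterns of length m = 3.
enc = OrdinalPatternEncoding(3)

ω = encode(enc, [1.2, 0.4, 2.5])  # an integer in 1:factorial(3)
π = decode(enc, ω)                # the corresponding permutation pattern
```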
These probabilities can be used to compute any of the entropies already defined in the library. The entropies themselves support an interface for different entropy estimators: defining an entropy is one thing, but there may be several ways to estimate it. Several entropy definitions are included: Shannon, Renyi, Tsallis, Kaniadakis, Curado, and StretchedExponential.
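Any entropy definition composes with any probabilities estimator through the same call (a sketch under the documented two-argument-plus-data signature):

```julia
using ComplexityMeasures

x = rand(10_000)
est = SymbolicPermutation(; m = 3)  # ordinal-pattern probabilities

# Discrete Shannon entropy of the ordinal-pattern distribution:
h = entropy(Shannon(), est, x)

# Any other definition plugs into the same call:
h2 = entropy(Renyi(; q = 2.0), est, x)

# Normalized by the maximum possible entropy of the outcome space:
hn = entropy_normalized(Shannon(), est, x)
```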
The package also has a generic interface for computing differential entropies instead of discrete ones.
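A differential entropy is obtained by swapping in a dedicated differential estimator; this sketch assumes the nearest-neighbor Kraskov estimator named in the docs:

```julia
using ComplexityMeasures

# A 2D set of points (rows are observations).
y = StateSpaceSet(randn(10_000, 2))

# Differential Shannon entropy via the Kraskov k-nearest-neighbor estimator.
h = entropy(Kraskov(; k = 4), y)
```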
On top of all that there is one more path: to compute “complexity measures”, quantities related to entropies but not entropies in the formal mathematical sense.
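For instance, sample entropy is computed through the generic `complexity` function (a sketch; the constructor's defaults, e.g. deriving the radius from the input, are a best guess from the docs):

```julia
using ComplexityMeasures

x = randn(5_000)

# Sample entropy: a complexity measure, not an entropy in the formal sense.
sampen = complexity(SampleEntropy(x), x)
```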
Content in numbers
- 78 ways of estimating discrete entropies of various kinds. 65 of these can be normalized. In total: 143 different quantifiers of discrete entropy.
- 11 ways of estimating differential Shannon entropy.
- 4 different complexity measures (e.g. sample entropy or approximate entropy).
Counting everything above, there are 158 different complexity measures available out of the box.
Interface design
Perhaps the biggest victory of this package, something that to our knowledge no other code base for computing entropy-related quantities has achieved, is its design:
- New probabilities estimators can be added by extending a couple of simple functions (see dev docs).
- By extending these functions, one immediately gains access to a bunch of other functions for computing discrete entropies and complexity measures (e.g. missing dispersion patterns).
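To illustrate the spirit of the extension API with a toy, hypothetical estimator (the exact functions to extend are listed in the dev docs and may differ from this sketch):

```julia
using ComplexityMeasures

# A toy estimator: the probabilities of a data point being negative
# or non-negative. Purely illustrative.
struct SignEstimator <: ProbabilitiesEstimator end

function ComplexityMeasures.probabilities_and_outcomes(::SignEstimator, x)
    n = length(x)
    nneg = count(<(0), x)
    probs = Probabilities([nneg / n, (n - nneg) / n])
    return probs, [-1, 1]  # the two possible outcomes
end

# With this method defined, the discrete entropy machinery,
# e.g. entropy(Shannon(), SignEstimator(), x), should work automatically.
```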
Closing remarks
There’s more functionality in progress, like the multiscale API, which will give access to multiscale variants of all the discrete measures, some of which have been explored in the literature, and most of which have not!
We sincerely believe this package will accelerate scientific research that uses complexity measures to classify or analyze timeseries, and we welcome feature requests and pull requests on the GitHub repo!