Ok, let’s collect what we want interfaces for.
What sparips wants interfaces for:
- distances: Using
Distances.Metric
seems fine. - Sparse filtered graphs: Each vertex and each edge has a filtration value. Maybe use something from the LightGraphs universe?
- Tolerances: For approximations, we need error tolerances. A tolerance is an interleaving; basically a nondecreasing function
psi
withpsi(r) <= r
; but we probably want to require the inverse ofpsi
as well, and potentially some other transforms. Potentially, we would permit two functions (in sparips, the reverse direction of psi is always the identity). The interface would spec some functions, and assumed properties. - Persistence diagrams / barcodes: Basically an array of arrays of tuples (dimension, list of classes, birth, death). Also, optional metadata on each class (solver statistics, representative cycles, cocycles) and metadata on the entire diagram (e.g. the field in which it was computed).
- Approximate persistence diagrams: A persistence diagram plus a tolerance. This is important for plotting and for filtering artifacts coming from an approximation.
How would I restructure my code:
- excise ripser bindings: in goes filtered graph, out goes persistence diagram.
- excise plotting: in goes persistence diagram and optional tolerance, out goes plot.
- remaining in sparips.jl: in goes metric, (opaque) data and tolerance, out goes sparse filtered graph.
- new (trivial): In goes metric and (opaque) data; out goes dense filtered graph. (effectively distance matrix)
- new: some bindings for gudhi. Gudhi provides an alternative sparsification, with relative error tolerances. Gudhi also provides solvers.
- new: Some statistics on filtered graphs (to measure how sparse they are, and how bad their rips complex will be).
Something that is very questionable: Input is filtered graph, output is filtered simplicial complex (clique complex / rips complex / induced flag-complex) up to some dimension. Reason for that: too much memory consumption. If you know that you are flag, then it should be an internal decision of the solver whether to ever commit the complex / its boundary matrix to memory (imo a bad decision), or recompute it on-the-fly (as ripser mostly does).
Extra things that are relevant: Many sparsification approaches can also produce a smaller zig-zag filtration that is homotopic to the original one. We should permit space for that in the filtered graph interface, in order to accomodate future solvers that can make use of this structure. As far as I know, all these things work via just reducing the number of points. So each vertex gets not just a filtration time filt_time(g, v)
(such that v
is not present at smaller times), but also an optional zig_zag_time(g, v)
, that asserts that v
can be deformed away at larger times. Hence, v
is only present in the zig-zag if zig_zag_time(g,v)<t<filt_time(g,v)
. Also, an optional zig_zag_parent(g,v, t)
, for t>zig_zag_time(g,v)
, that asserts that mapping v->parent
is a deformation at time t
. (obviously this needs some bike-shedding)
What are the interfaces you would like to have?
Do you want to support non-flag simplicial complexes?