I registered 3 packages useful in data analysis (see also jean-pierre both · GitLab):
- HnswAnn
  This package provides an interface to a Rust crate implementing the HNSW algorithm for approximate nearest neighbours. Installation of the Rust package is easy and documented.
- RandomProjectionTree (based on a paper by Freund and Dasgupta)
- PartialSvdStoch
  This package implements an algorithm of O. Shamir to increase the precision of truncated SVD algorithms. It builds upon the LowRankApprox package and also provides an algorithm of Vempala for incremental SVD approximation.

All three packages are implemented with care for performance (multithreading, SIMD acceleration for the Rust part, direct use of BLAS).
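
For readers who have not met random projection trees, here is a minimal, language-neutral sketch of the idea in plain Python. It is not the API of RandomProjectionTree, and every name in it (`build_rp_tree`, `find_leaf`, `leaf_size`) is hypothetical. It also simplifies the rule from the paper: each cell is cut exactly at the median of the projections on a random direction, with no random jitter of the split point.

```python
import math
import random


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def random_unit_vector(dim, rng):
    """Draw a direction on the unit sphere by normalizing Gaussian samples."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(dot(v, v))
        if norm > 0.0:
            return [x / norm for x in v]


def build_rp_tree(points, indices=None, leaf_size=4, rng=None):
    """Recursively split the points along random directions.

    Simplification (assumption): the cut is the median projection.
    Leaves are dicts holding the indices of the points they contain.
    """
    if rng is None:
        rng = random.Random(0)
    if indices is None:
        indices = list(range(len(points)))
    # Stop when the cell is small enough (at least one point per leaf).
    if len(indices) <= max(leaf_size, 1):
        return {"leaf": indices}
    direction = random_unit_vector(len(points[0]), rng)
    projections = sorted((dot(points[i], direction), i) for i in indices)
    mid = len(projections) // 2
    return {
        "direction": direction,
        "threshold": projections[mid][0],
        "left": build_rp_tree(points, [i for _, i in projections[:mid]], leaf_size, rng),
        "right": build_rp_tree(points, [i for _, i in projections[mid:]], leaf_size, rng),
    }


def find_leaf(tree, query):
    """Descend to the leaf cell containing `query`; its indices are neighbour candidates."""
    node = tree
    while "leaf" not in node:
        goes_left = dot(query, node["direction"]) < node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["leaf"]
```

Calling `find_leaf(tree, q)` returns the indices stored in the cell that contains `q`; a nearest-neighbour search would then compare `q` only against those candidates, usually over several independently built trees to improve recall.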