I didn’t see it in the docs and am trying to identify what algorithm is used in the implementation of find
(and union
) in the DisjointSets
object in the DataStructures.jl
package.
find
The docs say that “path compression” is used for the find
operation, which is good enough for (optimal) O(\alpha(n)) complexity, but requires two traversals of a path from node to root. In contrast, the Wikipedia entry notes that path-halving or path-splitting can be used to remove one of the traversals and still retain the optimal complexity (and is fast in practice).
Two questions: (1) am I correct in my understanding that the current implementation does not utilize path-halving or path-splitting? (2) would it be worthwhile to have an implementation of path-compression plus path-halving and path-splitting find
operations?
union
It seems the current implementation does union-by-rank (per Wikipedia). By-rank seems to be more space-efficient than by-size.