Data type for a “stupid” binary search tree

I need a data type for a collection of objects (stored in a given order) which supports both O(log n) insertion and deletion methods, as well as some way to perform a dichotomic search (where I provide the left-or-right test): namely, given a function cmp, where I promise that map(cmp, collection) is of the form (false, false, false..., true, true, true...), find the handle for the first object x such that cmp(x) == true.

In addition, I also need some way of performing sequential traversal (for swapping objects in the collection as needed to preserve the above promise).

In other words, I need this structure to behave like a Vector, but with efficient insertion and deletion (at the price of O(log n) lookup). This is not a sorted container; namely, the data in the container is indeed sorted according to some weird ordering, but I do not need the container to know anything about this order (all the comparisons are performed by the external cmp function at insertion/deletion time).

This should possible in theory by using a self-balancing binary tree; DataStructures.jl has several such types (e.g. BalancedTree23, and recent versions export RBTree), but they apparently all use comparison functions between the stored values, which is what I want to avoid. Does there exist a data structure implementing this, or must I write it from scratch?

(For more context: my goal is to implement a variant of the Bentley-Ottmann algorithm. The data structure stores a set of segments, represented as pairs of indices of points, and sorted by increasing intercept along a given x = constant line. The comparison between those segments is not done by computing the intercepts, because this would need to be redone each time a new line is chosen; instead, the correct insertion point is chosen for each new segment by computing determinants. Whenever the data becomes incorrectly-sorted in the list, this is detected and they are manually swapped).

this function only has one argument?

The one-argument form is of course an abstraction of x->cmp(some_reference_value, x).

I’m not sure if I understand but maybe GitHub - andyferris/AcceleratedArrays.jl: Arrays with acceleration indices is relevant

1 Like

Anyway, the description is very complicated. But I would just use existing trees structures and just overload some methods using MD

using DataStructures
tree = RBTree{Int}();

push!.(Ref(tree), 1:20)

import Base

Base.iterate(tree::RBTree) = Base.iterate(tree, 1)

function Base.iterate(tree::RBTree, pos::Integer)
    @assert pos >= 1
    if pos <= length(tree)
        return tree[pos], pos+1
        return nothing

cmp(x) = x == 2

map(cmp, tree)

function Base.findfirst(cmp::Function, tree::RBTree)
    for i = 1:length(tree)
        if cmp(tree[i])
            return tree[i]

findfirst(cmp, tree)