Best way to store a sparse, unbalanced, and not complete binary tree

I am implementing a binary tree where given the key of the root and a function the left and right children are defined up to a given depth that is an other parameter of the constructing function: create_tree!(tree, key, func, depth) . The most straightforward way in my opinion is to define tree as a Vector{Union{Int,Nothing} . This solution is clear, in my opinion. However in my case func is such that if depth=15 around 97% of the nodes in tree happen to be nothing . Since I would like to avoid wasting memory and I need even deeper trees I am asking: is there a good way to implement such a binary tree in Julia? To give more context I can also add that func is such that is more probable that the left child of a given node is nothing with respect to the right child. In a language like C or C++ I would use pointer based construction of a tree. I suppose I could define a Julia type that acts like a pointer but I am wondering if there is a easier and more straightforward solution.

Iā€™d probably use

mutable struct Node{T}
    value::T
    left::Union{Nothing, Node{T}}
    right::Union{Nothing, Node{T}}
end
const Tree{T} = Union{Nothing, Node{T}}
2 Likes

Agree with @goerchā€™ answer, and we can do away with Union{Nothing, Node{T}} by incomplete initialization, which makes it acts more like a pointer thing.

mutable struct TreeNode{T}
    value::T
    parent::TreeNode{T}
    left::TreeNode{T}
    right::TreeNode{T}
    function TreeNode{T}() where T
        n = new{T}()
        n.parent = n
        n.left = n
        n.right = n
        n
    end
    TreeNode{T}(v, p, l, r) where T = new{T}(v, p, l, r)
end

With this definition, some basic functions of the linked-binary-tree need to be redefined, such as isroot(n::TreeNode) = n === n.parent. I implemented a linked-quadtree type in my project. Things are similar. I hope it will help you.

3 Likes

Interesting. On one hand I can imagine that using undef that way could simplify type inference. OTOH you need to use isassigned(p) instead of p !== nothing and worse to reset p to undef youā€™d need something like unsafe_store! as @jakobnissen pointed out recently on slack.

Depends on the application, as always.

If you donā€™t need to modify the tree structure after construction, Iā€™m a big fan of the following layout. It is optimized for the case that you need to traverse left more often than right (following left typically hits the same cache-line that you already have loaded), so you may need to internally rename left-right. Only existing nodes are stored (no nothing stuff). There is no depth-bound, you simply have a bunch of nodes.

The structure is

struct Tree{T}
keys::Vector{T}
child_right::Vector{Int} #or Int32 or UInt32...
end

The order is:

[root, L, LL, ...., LR, LRL, LRLL, ..., R, RL, ...]

(in other words, attach to each node the ā€˜Lā€™/ā€˜Rā€™ string corresponding to the path in the tree to reach the node; then sort lexicographically; for each node, store as metadata the offset to (or position of, depending on what you find more convenient) string_label(node) * "R" if that node exists in the tree and 0 otherwise)

Subtrees are always sequential in memory and their size is known without further searching. Searching for keys is always accessing memory in sequential ascending order, unless youā€™re e.g. searching for a best-match. When the remaining subtree is small enough, you can seamlessly switch to linear search.

It is quite useful in practice to also include parent pointers for convenience.

Structure is also good for easily parallelized in-place construction or computing and storing tree-stuff for every node (spawn a task for the right subtree if size above threshold; due to the memory layout you very rarely have slowdowns due to different cores writing to the same cacheline).

PS. You must identify subtrees by the pair (subtree_root_index, subtree_upper_bound). The upper bound is computed during traversal from root. Otherwise you cannot distinguish the situations where a node has no children at all, or no right child but a nonempty left subtree.

If you include extra metadata (e.g. parent pointers, subtree_upper_bound, domain-specific info) then it is advisable to use something like

struct NodeMetadata
child_right::Int
#e.g. parent::Int
...
end

struct Tree{T}
data::Vector{T}
metadata::Vector{NodeMetadata}
end

for the sake of better memory access.

I advise against attempting to mix data and metadata ā€“ that opens a big can of structure padding worms, makes SIMD harder and makes it harder to use domain-specific implementation for linear search on small subtrees.

5 Likes

Is this the vine data structure representing a tree traversal?

Itā€™s original (in the sense that I once needed something like this and came up with it), but almost certainly not novel. Itā€™s the kind of stuff that is easier to reinvent or ask discourse / stackoverflow / ā€œoral communicationsā€ about than doing a literature search for.

If you happen to know a famous name for this layout, please share! A wiki link could save me some source code comments and/or pages of explanations in papers that use this.

The vine stuff you linked to looks different? The wiki talks about ā€œrebalancingā€, while I only gave a memory layout and metadata style for a given tree. How do build it depends on your domain ā€“ there are very good domain-specific reasons for unbalanced trees. E.g. if your datapoints have probability mass then you want your tree to be balanced with respect to mass, not with respect to number-of-points. If your datapoints live in a metric space, you may want to balance your tree by diameter, not number-of-points (that was my domain). Either way you end up with a tree that is horribly unbalanced for good reasons and that must not be rebalanced according to naive ā€œnumber of pointsā€.

Hence the above layout takes some pains to not be worse than a flat array (no linked-list heap-vomit with large overheads in case of completely degenerate tree, modest metadata overhead, etc).

1 Like

Can you provide a link to @jakobnissenā€™s proposal?
In fact, I initialized the nodeā€™s field with itself instead of leaving them undefined. I wonder if there is any performance difference in accessing uninitialized fields. Or is there always a protective check, whether it is initialized or not?

Here is @jakobnissen 's example

julia> A = ["hello"]
1-element Vector{String}:
 "hello"

julia> unsafe_store!(Ptr{UInt}(pointer(A, 1)), zero(UInt));

julia> A
1-element Vector{String}:
 #undef

I almost feel bad unleasing such shady code into the world! :sweat_smile:
You should make sure Union{String, Nothing} isnā€™t good enough first.

If you store it in a struct:

julia> struct Foo
           x::Union{String, Nothing}
       end

julia> sizeof(Foo)
8

Then the union type takes no extra memory. Itā€™s simply stored as a pointer either to the Nothing type (I think), or to the actual string.
If you use an array of the kind Vector{Union{String, Nothing}}, then the memory layout will be two dense arrays, one with pointers, and another byte array with 0x00 for Nothing and 0x01 for String, for a total of only 9 bytes per object, which is only 13% more than the 8 bytes a pure String array would take.

No need to! The code looked terrible enough for me to try to find another solutionā€¦

Here we discuss if using uninitialized or partially initialized structs could provide any benefits. Does your technique apply to structs also? Iā€™m nearly tempted to write a micro benchmark.

Thanks for your help!

Yes, you can use pointer_from_objref to get a pointer to a mutable type, and for immutable types, you can put (probably?) them in a Ref, then take a pointer to the Ref. Again, :bomb:

1 Like

Edit: I obviously have to use mutablesā€¦

I tried this

mutable struct Example
    el::String
    Example() = new()
    Example(el) = new(el)
end

B = Example("hello")
unsafe_store!(Ptr{UInt}(pointer_from_objref(B)), zero(UInt), 1);
println(B)

yielding

Example(#undef)

Thanks!

Interesting. IMHO the micro benchmark exactly shows the tradeoffs:

using BenchmarkTools

mutable struct Node1{T}
    value::T
    next::Union{Nothing, Node1{T}}
end
const List1{T} = Union{Nothing, Node1{T}}

function test1()
    list = nothing
    for i in 1:1000
        list = Node1(i, list)
    end
    nodes = []
    s = 0
    while list !== nothing
        s += list.value
        push!(nodes, list)
        list = list.next
    end
    for node in nodes
        node.next = nothing
        @assert node.next === nothing
    end
    s
end
      
mutable struct Node2{T}
    value::T
    next::Node2{T}
    Node2{T}(value) where T = new{T}(value)
    Node2{T}(value, next) where T = new{T}(value, next)
end
const List2{T} = Union{Nothing, Node2{T}}

function test2()
    list = Node2{Int}(1)
    for i in 2:1000
        list = Node2{Int}(i, list)
    end
    nodes = []
    s = 0
    while list !== nothing
        s += list.value
        push!(nodes, list)
        list = isdefined(list, :next) ? list.next : nothing
    end
    for node in nodes
        ptr = convert(Ptr{UInt}, Base.pointer_from_objref(node))
        unsafe_store!(ptr, zero(UInt), 2)
        @assert !isdefined(node, :next)
    end
    s
end

res1 = test1()
res2 = test2()
@assert res1 == res2

@btime test1()
@btime test2()

yielding

  104.300 Ī¼s (1006 allocations: 53.09 KiB)
  24.600 Ī¼s (1006 allocations: 53.09 KiB)

Which type annotations are needed to eliminate dynamic dispatch in the first test case?

Maybe it would be better to have a new function like setasundef! in julia Base.

I donā€™t get the benchmarking - what two, different, functions are you benchmarking?

Inserting into a list, sum the list entries and set all next pointers to nothing or undefined respectively. First data structure uses Union{Nothing, Node1{T}} as pointer to next element, second one uses Node2{T} with undef as null indicator.

Whoops, I didnā€™t see the code box could scroll down :slight_smile:
Anyway, if you solve the type instability by specifying the type of the array, the union type is faster for me:

julia> include("foo.jl")
WARNING: replacing module Foo.
  11.524 Ī¼s (1006 allocations: 53.09 KiB)
  18.020 Ī¼s (1006 allocations: 53.09 KiB)
1 Like

Yes, but this is partly the point: with Union{Nothing,T} you sometimes have to care about type instabilities, see https://github.com/JuliaCollections/DataStructures.jl/pull/766 for more examples.

1 Like