Best way to store a sparse, unbalanced, and not complete binary tree

AlessioLapolla · January 20, 2022, 4:49am

I am implementing a binary tree where given the key of the root and a function the left and right children are defined up to a given depth that is an other parameter of the constructing function: create_tree!(tree, key, func, depth) . The most straightforward way in my opinion is to define tree as a Vector{Union{Int,Nothing} . This solution is clear, in my opinion. However in my case func is such that if depth=15 around 97% of the nodes in tree happen to be nothing . Since I would like to avoid wasting memory and I need even deeper trees I am asking: is there a good way to implement such a binary tree in Julia? To give more context I can also add that func is such that is more probable that the left child of a given node is nothing with respect to the right child. In a language like C or C++ I would use pointer based construction of a tree. I suppose I could define a Julia type that acts like a pointer but I am wondering if there is a easier and more straightforward solution.

goerch · January 20, 2022, 5:11am

I’d probably use

mutable struct Node{T}
    value::T
    left::Union{Nothing, Node{T}}
    right::Union{Nothing, Node{T}}
end
const Tree{T} = Union{Nothing, Node{T}}

guoyongzhi · January 21, 2022, 3:13am

Agree with @goerch’ answer, and we can do away with Union{Nothing, Node{T}} by incomplete initialization, which makes it acts more like a pointer thing.

mutable struct TreeNode{T}
    value::T
    parent::TreeNode{T}
    left::TreeNode{T}
    right::TreeNode{T}
    function TreeNode{T}() where T
        n = new{T}()
        n.parent = n
        n.left = n
        n.right = n
        n
    end
    TreeNode{T}(v, p, l, r) where T = new{T}(v, p, l, r)
end

With this definition, some basic functions of the linked-binary-tree need to be redefined, such as isroot(n::TreeNode) = n === n.parent. I implemented a linked-quadtree type in my project. Things are similar. I hope it will help you.

goerch · January 21, 2022, 10:40am

Interesting. On one hand I can imagine that using undef that way could simplify type inference. OTOH you need to use isassigned(p) instead of p !== nothing and worse to reset p to undef you’d need something like unsafe_store! as @jakobnissen pointed out recently on slack.

foobar_lv2 · January 21, 2022, 11:45am

Depends on the application, as always.

If you don’t need to modify the tree structure after construction, I’m a big fan of the following layout. It is optimized for the case that you need to traverse left more often than right (following left typically hits the same cache-line that you already have loaded), so you may need to internally rename left-right. Only existing nodes are stored (no nothing stuff). There is no depth-bound, you simply have a bunch of nodes.

The structure is

struct Tree{T}
keys::Vector{T}
child_right::Vector{Int} #or Int32 or UInt32...
end

The order is:

[root, L, LL, ...., LR, LRL, LRLL, ..., R, RL, ...]

(in other words, attach to each node the ‘L’/‘R’ string corresponding to the path in the tree to reach the node; then sort lexicographically; for each node, store as metadata the offset to (or position of, depending on what you find more convenient) string_label(node) * "R" if that node exists in the tree and 0 otherwise)

Subtrees are always sequential in memory and their size is known without further searching. Searching for keys is always accessing memory in sequential ascending order, unless you’re e.g. searching for a best-match. When the remaining subtree is small enough, you can seamlessly switch to linear search.

It is quite useful in practice to also include parent pointers for convenience.

Structure is also good for easily parallelized in-place construction or computing and storing tree-stuff for every node (spawn a task for the right subtree if size above threshold; due to the memory layout you very rarely have slowdowns due to different cores writing to the same cacheline).

PS. You must identify subtrees by the pair (subtree_root_index, subtree_upper_bound). The upper bound is computed during traversal from root. Otherwise you cannot distinguish the situations where a node has no children at all, or no right child but a nonempty left subtree.

If you include extra metadata (e.g. parent pointers, subtree_upper_bound, domain-specific info) then it is advisable to use something like

struct NodeMetadata
child_right::Int
#e.g. parent::Int
...
end

struct Tree{T}
data::Vector{T}
metadata::Vector{NodeMetadata}
end

for the sake of better memory access.

I advise against attempting to mix data and metadata – that opens a big can of structure padding worms, makes SIMD harder and makes it harder to use domain-specific implementation for linear search on small subtrees.

goerch · January 21, 2022, 1:50pm

Is this the vine data structure representing a tree traversal?

foobar_lv2 · January 21, 2022, 2:19pm

It’s original (in the sense that I once needed something like this and came up with it), but almost certainly not novel. It’s the kind of stuff that is easier to reinvent or ask discourse / stackoverflow / “oral communications” about than doing a literature search for.

If you happen to know a famous name for this layout, please share! A wiki link could save me some source code comments and/or pages of explanations in papers that use this.

The vine stuff you linked to looks different? The wiki talks about “rebalancing”, while I only gave a memory layout and metadata style for a given tree. How do build it depends on your domain – there are very good domain-specific reasons for unbalanced trees. E.g. if your datapoints have probability mass then you want your tree to be balanced with respect to mass, not with respect to number-of-points. If your datapoints live in a metric space, you may want to balance your tree by diameter, not number-of-points (that was my domain). Either way you end up with a tree that is horribly unbalanced for good reasons and that must not be rebalanced according to naive “number of points”.

Hence the above layout takes some pains to not be worse than a flat array (no linked-list heap-vomit with large overheads in case of completely degenerate tree, modest metadata overhead, etc).

guoyongzhi · January 22, 2022, 3:47am

Can you provide a link to @jakobnissen’s proposal?
In fact, I initialized the node’s field with itself instead of leaving them undefined. I wonder if there is any performance difference in accessing uninitialized fields. Or is there always a protective check, whether it is initialized or not?

goerch · January 22, 2022, 10:19am

Here is @jakobnissen 's example

julia> A = ["hello"]
1-element Vector{String}:
 "hello"

julia> unsafe_store!(Ptr{UInt}(pointer(A, 1)), zero(UInt));

julia> A
1-element Vector{String}:
 #undef

jakobnissen · January 22, 2022, 10:47am

I almost feel bad unleasing such shady code into the world!
You should make sure Union{String, Nothing} isn’t good enough first.

If you store it in a struct:

julia> struct Foo
           x::Union{String, Nothing}
       end

julia> sizeof(Foo)
8

Then the union type takes no extra memory. It’s simply stored as a pointer either to the Nothing type (I think), or to the actual string.
If you use an array of the kind Vector{Union{String, Nothing}}, then the memory layout will be two dense arrays, one with pointers, and another byte array with 0x00 for Nothing and 0x01 for String, for a total of only 9 bytes per object, which is only 13% more than the 8 bytes a pure String array would take.

goerch · January 22, 2022, 10:55am

No need to! The code looked terrible enough for me to try to find another solution…

Here we discuss if using uninitialized or partially initialized structs could provide any benefits. Does your technique apply to structs also? I’m nearly tempted to write a micro benchmark.

Thanks for your help!

jakobnissen · January 22, 2022, 11:07am

Yes, you can use pointer_from_objref to get a pointer to a mutable type, and for immutable types, you can put (probably?) them in a Ref, then take a pointer to the Ref. Again,

goerch · January 22, 2022, 12:13pm

Edit: I obviously have to use mutables…

I tried this

mutable struct Example
    el::String
    Example() = new()
    Example(el) = new(el)
end

B = Example("hello")
unsafe_store!(Ptr{UInt}(pointer_from_objref(B)), zero(UInt), 1);
println(B)

yielding

Example(#undef)

Thanks!

goerch · January 22, 2022, 12:43pm

Interesting. IMHO the micro benchmark exactly shows the tradeoffs:

using BenchmarkTools

mutable struct Node1{T}
    value::T
    next::Union{Nothing, Node1{T}}
end
const List1{T} = Union{Nothing, Node1{T}}

function test1()
    list = nothing
    for i in 1:1000
        list = Node1(i, list)
    end
    nodes = []
    s = 0
    while list !== nothing
        s += list.value
        push!(nodes, list)
        list = list.next
    end
    for node in nodes
        node.next = nothing
        @assert node.next === nothing
    end
    s
end
      
mutable struct Node2{T}
    value::T
    next::Node2{T}
    Node2{T}(value) where T = new{T}(value)
    Node2{T}(value, next) where T = new{T}(value, next)
end
const List2{T} = Union{Nothing, Node2{T}}

function test2()
    list = Node2{Int}(1)
    for i in 2:1000
        list = Node2{Int}(i, list)
    end
    nodes = []
    s = 0
    while list !== nothing
        s += list.value
        push!(nodes, list)
        list = isdefined(list, :next) ? list.next : nothing
    end
    for node in nodes
        ptr = convert(Ptr{UInt}, Base.pointer_from_objref(node))
        unsafe_store!(ptr, zero(UInt), 2)
        @assert !isdefined(node, :next)
    end
    s
end

res1 = test1()
res2 = test2()
@assert res1 == res2

@btime test1()
@btime test2()

yielding

  104.300 μs (1006 allocations: 53.09 KiB)
  24.600 μs (1006 allocations: 53.09 KiB)

Which type annotations are needed to eliminate dynamic dispatch in the first test case?

guoyongzhi · January 23, 2022, 4:58am

Maybe it would be better to have a new function like setasundef! in julia Base.

jakobnissen · January 23, 2022, 11:46am

I don’t get the benchmarking - what two, different, functions are you benchmarking?

goerch · January 23, 2022, 11:53am

Inserting into a list, sum the list entries and set all next pointers to nothing or undefined respectively. First data structure uses Union{Nothing, Node1{T}} as pointer to next element, second one uses Node2{T} with undef as null indicator.

jakobnissen · January 23, 2022, 12:04pm

Whoops, I didn’t see the code box could scroll down
Anyway, if you solve the type instability by specifying the type of the array, the union type is faster for me:

julia> include("foo.jl")
WARNING: replacing module Foo.
  11.524 μs (1006 allocations: 53.09 KiB)
  18.020 μs (1006 allocations: 53.09 KiB)

goerch · January 23, 2022, 12:09pm

Yes, but this is partly the point: with Union{Nothing,T} you sometimes have to care about type instabilities, see https://github.com/JuliaCollections/DataStructures.jl/pull/766 for more examples.

Topic		Replies	Views
Efficiently traverse binary tree represented as tuple of tuples Performance memory-allocation , benchmark	9	1921	August 7, 2019
Implementing a Trie General Usage question , data_structures , tree	6	404	May 19, 2023
Approaches to sentinel values in immutables General Usage question	71	622	June 9, 2025
Default constructor for any type? General Usage question	94	1898	February 29, 2024
Persistent tree data structure as an alternative to Rust-like borrow checker General Usage	9	322	January 7, 2025

Best way to store a sparse, unbalanced, and not complete binary tree

Related topics