Is it possible to traverse a tree without allocations

I have defined a tree made of ‘nodes’ and ‘leaves’ as follows. Each leaf has a vector of indices that I would like to iterate under certain conditions.

using AbstractTrees

struct BVH
    name::String
    children::Union{Nothing, Tuple{BVH, BVH}}
    indices::Union{Nothing, Vector{Int64}}
end

BVH(n::String, l::BVH, r::BVH) = BVH(n, (l,r), nothing)        # construct a node
BVH(n::String, inds::Vector{Int64}) = BVH(n, nothing, inds)    # construct a leaf

Base.show(io::IO, b::BVH) = print(io, isnothing(b.children) ? "Leaf-" : "Node-", b.name)
AbstractTrees.children(bvh::BVH) = isnothing(bvh.children) ? [] : bvh.children

function indices(b::BVH)
    isnothing(b.children) && return (i for i in b.indices)
    isnothing(b.indices) && return Iterators.flatten(indices(c) for c in b.children)
end

bvh = BVH("root", BVH("1",BVH("2", [1,2,3]),BVH("3", BVH("4", [4,5,6]), BVH("5",[7,8,9]))),BVH("6",[10,11,12]))

I do get with this example 2 allocations.

julia> print_tree(bvh)
Node-root
├─ Node-1
│  ├─ Leaf-2
│  └─ Node-3
│     ├─ Leaf-4
│     └─ Leaf-5
└─ Leaf-6

julia> collect(indices(bvh))
12-element Vector{Int64}:
  1
  ...
 12

julia> @btime indices($bvh)
  23.416 ns (2 allocations: 64 bytes)

Of course @code_warntype shows me that the problem has to do with the different kind of generator I am returning.
Union{Base.Generator{Nothing, typeof(identity)}, Base.Generator{Vector{Int64}, typeof(identity)}}

Is there a any way to achieving the goal without allocations? Thanks.

Yes, you can traverse the tree without allocations (seems you already do), it’s just there since you return something that needs to be allocated (since it’s variable-length).

function indices_1_alloc(b::BVH)
           isnothing(b.children) && return (i for i in b.indices)
           isnothing(b.indices) && return (indices(c) for c in b.children)
           nothing # this line may be needed, at least makes @code_warntype slightly different
end

Why do you need to return an empty Vector{Any} here, instead of just returning nothing?

I suppose since it seemed to work (while a non-obvious performance, only, bug), and your way (correct?):

julia> AbstractTrees.children(bvh::BVH) = isnothing(bvh.children) ? nothing : bvh.children

julia> print_tree(bvh)
Node-root
├─ Node-1
│  ├─ Leaf-2
ERROR: MethodError: no method matching iterate(::Nothing)

This is only for the AbstractTrees printing. It needs to return a iterateable and nothing is not iterateable.

Did you try returing an empty tuple instead? That should be iterable, and perhaps have better performance (I’m not sure, though.) It also seems to make more sense, then you would would have either NTuple{2, BVH} or NTuple{0, BVH} (the same as the empty tuple.)

(I don’t know much about trees in Julia, but just recently found a need to start learning about them.)

Thanks. I guess indices in the 3rd line should be indices_1_alloc, no?
But in one case it returns a generator of Int64 and in the other a generator of generators. Isn’t it? This is why I had the flatten.

Perhaps my problem is further down to the code using indices(...). Since it returns different types does not know that indeed iterating on them will always return a Int64. The effect is type instability and when using the index on vectors it always returns Any. I would guess that solution would be preallocating a Vector{Int64} and passing it to indices(...). But this would imply a lot of copying of indices.

function indices(b::BVH)
    isnothing(b.children) && return (i for i in b.indices::Vector{Int64})
    isnothing(b.indices) && return Iterators.flatten(indices(c) for c in b.children::Tuple{BVH, BVH})
end

gets rid of one allocation. Of course your indices function still won’t be type stable.

One thing is the efficiency of getting the indices lazily, but when collecting them it seems quite slow:

@btime indices($bvh) |> collect
  6.040 μs (94 allocations: 4.69 KiB)

Using a non-lazy implementation, I can make it far more efficient to actually get the materialized indices (apologies for making several changes to your code, I am experimenting a bit with the trees for learning purposes, so there might be more than one single change causing the speedup):

struct BVH
    name::String
    children::Union{Tuple{}, Tuple{BVH, BVH}}
    indices::Vector{Int64}
end


BVH(n::String, l::BVH, r::BVH) = BVH(n, (l,r), Int[])        # construct a node
BVH(n::String, inds::Vector{Int64}) = BVH(n, (), inds)    # construct a leaf

Base.show(io::IO, b::BVH) = print(io, isempty(b.children) ? "Leaf-" : "Node-", b.name)
AbstractTrees.children(bvh::BVH) = bvh.children

function _indices!(ind::Vector{Int}, b::BVH)
    isempty(b.children) && return append!(ind, b.indices)
    _indices!(ind, b.children[1])
    _indices!(ind, b.children[2])
    return ind
end
indices(b::BVH) = _indices!(Int[], b)

Benchmark:

julia> bvh = BVH("root", BVH("1",BVH("2", [1,2,3]),BVH("3", BVH("4", [4,5,6]), BVH("5",[7,8,9]))),BVH("6",[10,11,12]))

julia> @btime indices($bvh)
  147.188 ns (3 allocations: 480 bytes)
12-element Vector{Int64}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12

Thanks for the suggestion to be be less ‘lazy’ and more concrete. Indeed I can even allocate a vector outside the function and get rid of all allocations with the overhead of copying all indexes to the vector. For a small tree and small number of indexes works nicely

julia> indices!(ind::Vector{Int64}, b::BVH) = return _indices!(empty!(ind), b)

julia> ind = zeros(Int64,100)
100-element Vector{Int64}

julia> @btime indices!($ind, $bvh)
  81.125 ns (0 allocations: 0 bytes)

Of course, increasing the number of indexes then the time scales linearly. Too reach 6 μs I would need many thousands of indices to copy.