# Efficient recursive iteration over self referencing struct

Suppose i have a `Node` type that may have a reference to itself on one of its fields. What is the most efficient way to implement `Base.iterate` for this `Node` type, such that at every iteration we get the next child `Node`?

``````julia> const Maybe{T} = Union{T, Nothing}
Union{Nothing, T} where T

julia> struct Node{T}
name::String
value::T
child::Maybe{Node}
end

julia> Base.iterate(::Node) = ??
``````

I was trying to overengineer the solution for this problem, and it turns out it is more simpler than i thought, but iām still no sure if thereās a more efficient way, however, this works for now:

``````julia> function Base.length(node::Node)
len = 0
while !isnothing(node)
len += 1
node = node.child
end
return len
end

julia> Base.iterate(node::Node, state::Node = node) = state, state.child

julia> Base.iterate(node::Node, ::Nothing) = nothing

julia> node = Node("x", 1, Node("y", 2, Node("z", 3)))
Node{Int64}("x", 1, Node{Int64}("y", 2, Node{Int64}("z", 3, nothing)))

julia> length(node)
3

julia> collect(node)
3-element Vector{Any}:
Node{Int64}("x", 1, Node{Int64}("y", 2, Node{Int64}("z", 3, nothing)))
Node{Int64}("y", 2, Node{Int64}("z", 3, nothing))
Node{Int64}("z", 3, nothing)

``````

When you say āmore efficientā, what exactly are you imagining the inefficiency to be? There are some ways you can make a linked list more efficient cache-wise, but they require quite a few changes. Just accessing the struct member and checking whether itās nothing is probably as good as you can get with just this approach.

1 Like

By āmore efficientā, i mean fast, a non-naive solution. Iām not good at writing efficient (fast) code. For most of the time in the past i didnāt care much about effiency and optimization, as long as my code was working it was fine, but now i want to be better at write efficient code. So if anybody has a better solution for the problem that i described explaining why is better it would be of much help.

(Sorry for my english: i actually speak portuguese)

I see! Well, the inefficiency of linked lists is inherent to the datastructure. What youāre asking about is how you can make a linked list faster in and of itself, and the āproperā answer to that would be ādonāt use a linked listā (assuming you want to make reading faster). Another complication is that linked lists truly are better for some workloads, but this depends on how you want to use them. There is no One-size-fits-all perfect datastructure.

Are you familiar with Big-O notation?

1 Like

I do not remember seeing a fast linked list implementation, I should write a package with one, but I believe I do not have the time for now.

The fastest way I know to implement a linked list is with two `Vector`, it works very well if the list starts populated and mostly suffer removals, and/or the insertions are few and distributed among the whole list. However, if you need to insert a lot of elements between other elements already present you either do your way or maybe go for `OrderedCollections.jl`.

The two-`Vector` way consists in a data `Vector{T}` which will keep the elements (no initialization is necessary) and a next `Vector{Int}` which points to the next position from the current position. If you want to start with everything populated you just create a data Vector with the elements in order, and `1:length(data)` as the next. It can be circular (the last next can be one) or not (use `0` or other sentinel value in the last next position). If you think you will add elements, you can make it double the size, initialize only the odd data positions, and have next be `[3, 0, 5, 0, 7, ...]`. So if you can put new elements between each pair of current elements and future shifts will the the smallest possible. Is this understandable? This is just a very bare-bones explanation.

1 Like

Interesting, iām know about Data Structures, but iām still noob when it comes to choose which one is better for the job.

Yes, iām familiar.

I think i understand what you mean. One day i watched a video about implementing a GUI library in Rust using Data-Oriented Programming, so instead of creating a wigdet structure with references to its children and parents, there was another structure that behaves as a āmanagerā which stored each widget on a `Vector` and the child of each widget as indices on another `Vector`. At the time i did not understanded correctly, but now it starting to make sense. Thanks for your answer.

Not my best work, but supports sparsity, circularity (given one or more starting elements), and iteration. May implement removal and insertion after.

``````module FastLinkedLists

struct SingleLL{T, N}
data :: Vector{T}
next :: Vector{N}
end

"""
function SingleLL(data, ::Type{N} = Int; kwargs...) where {T, N}

with an empty circular buffer is no supported.
"""
function SingleLL(
data :: Vector{T}, ::Type{N} = Int;
copy_data = true, sparsity_factor :: Int = 1,
circular = false
) where {T, N}
if sparsity_factor < 1
throw(ArgumentError("sparsity_factor should be one or more"))
end
if !copy_data && sparsity_factor > 1
throw(ArgumentError(
"You cannot disable copy_data and have sparsity_factor greater than one" *
", use the default constructor for this kind of fiddling."
))
end
if circular && isempty(data)
throw(ArgumentError(
"You cannot start with an empty circular buffer, the circularity is " *
"engraved into the elements, not the list itself."
))
end
sparsity_factor = convert(N, sparsity_factor)

isempty(data) && return SingleLL{T, N}(copy_data ? copy(data) : data, N[])

if isone(sparsity_factor)
data = copy_data ? copy(data) : data
next = collect(convert(N, 2):convert(N, length(data)))
next[end] = circular ? one(N) : zero(N)
return SingleLL{T, N}(data, next)
end

n = length(data)
m = n * sparsity_factor
data_ = Vector{T}(undef, m)
next = zeros(N, m)
j :: Int, i :: Int = 1, 1
while i <= n
data_[j] = data[i]
next[j] = j + sparsity_factor
j = next[j]
i += 1
end
next[j - sparsity_factor] = circular ? one(n) : zero(n)

return SingleLL{T, N}(data_, next)
end

import Base.iterate
function Base.iterate(l :: SingleLL{T, N}) where {T, N}
isempty(l.next) && return nothing
i = 1
while iszero(l.next[i]); i += 1; end
return (l.data[i], l.next[i])
end

function Base.iterate(l :: SingleLL{T, N}, i) where {T, N}
iszero(i) && return nothing
return (l.data[i], l.next[i])
end

end # module
``````
1 Like

Is the output of `collect` correct? I was expecting to see a list of `(String, Integer)` pairs.

Well, the iteration returns the current `Node`, but if you want to return only the `name` and `value`, you do something like:

``````julia> Base.iterate(node::Node, state::Node = node) = state.name => state.value, state.child

julia> collect(node)
3-element Vector{Any}:
"x" => 1
"y" => 2
"z" => 3
``````

For my specific use case, i realized that having an `Array` where every next `Node` is a child of the previous `Node` is much better than having a linked list, because my situation is much more simpler, since every `Node` can only have one and only one child or no child.