Efficient recursive iteration over self referencing struct

dylanxyz · June 24, 2022, 9:05pm

Suppose i have a Node type that may have a reference to itself on one of its fields. What is the most efficient way to implement Base.iterate for this Node type, such that at every iteration we get the next child Node?

julia> const Maybe{T} = Union{T, Nothing}
Union{Nothing, T} where T

julia> struct Node{T}
           name::String
           value::T
           child::Maybe{Node}
       end

julia> Base.iterate(::Node) = ??

Thanks in advance!

dylanxyz · June 25, 2022, 7:16am

I was trying to overengineer the solution for this problem, and it turns out it is more simpler than i thought, but i’m still no sure if there’s a more efficient way, however, this works for now:

julia> function Base.length(node::Node)
           len = 0
           while !isnothing(node)
               len += 1
               node = node.child
           end
           return len
       end

julia> Base.iterate(node::Node, state::Node = node) = state, state.child

julia> Base.iterate(node::Node, ::Nothing) = nothing

julia> node = Node("x", 1, Node("y", 2, Node("z", 3)))
Node{Int64}("x", 1, Node{Int64}("y", 2, Node{Int64}("z", 3, nothing)))

julia> length(node)
3

julia> collect(node)
3-element Vector{Any}:
 Node{Int64}("x", 1, Node{Int64}("y", 2, Node{Int64}("z", 3, nothing)))
 Node{Int64}("y", 2, Node{Int64}("z", 3, nothing))
 Node{Int64}("z", 3, nothing)

Sukera · June 25, 2022, 7:31am

When you say “more efficient”, what exactly are you imagining the inefficiency to be? There are some ways you can make a linked list more efficient cache-wise, but they require quite a few changes. Just accessing the struct member and checking whether it’s nothing is probably as good as you can get with just this approach.

dylanxyz · June 25, 2022, 7:48am

By “more efficient”, i mean fast, a non-naive solution. I’m not good at writing efficient (fast) code. For most of the time in the past i didn’t care much about effiency and optimization, as long as my code was working it was fine, but now i want to be better at write efficient code. So if anybody has a better solution for the problem that i described explaining why is better it would be of much help.

(Sorry for my english: i actually speak portuguese)

Sukera · June 25, 2022, 8:15am

I see! Well, the inefficiency of linked lists is inherent to the datastructure. What you’re asking about is how you can make a linked list faster in and of itself, and the “proper” answer to that would be “don’t use a linked list” (assuming you want to make reading faster). Another complication is that linked lists truly are better for some workloads, but this depends on how you want to use them. There is no One-size-fits-all perfect datastructure.

Are you familiar with Big-O notation?

Henrique_Becker · June 25, 2022, 3:26pm

I do not remember seeing a fast linked list implementation, I should write a package with one, but I believe I do not have the time for now.

The fastest way I know to implement a linked list is with two Vector, it works very well if the list starts populated and mostly suffer removals, and/or the insertions are few and distributed among the whole list. However, if you need to insert a lot of elements between other elements already present you either do your way or maybe go for OrderedCollections.jl.

The two-Vector way consists in a data Vector{T} which will keep the elements (no initialization is necessary) and a next Vector{Int} which points to the next position from the current position. If you want to start with everything populated you just create a data Vector with the elements in order, and 1:length(data) as the next. It can be circular (the last next can be one) or not (use 0 or other sentinel value in the last next position). If you think you will add elements, you can make it double the size, initialize only the odd data positions, and have next be [3, 0, 5, 0, 7, ...]. So if you can put new elements between each pair of current elements and future shifts will the the smallest possible. Is this understandable? This is just a very bare-bones explanation.

dylanxyz · June 25, 2022, 3:57pm

Interesting, i’m know about Data Structures, but i’m still noob when it comes to choose which one is better for the job.

Yes, i’m familiar.

dylanxyz · June 25, 2022, 4:04pm

I think i understand what you mean. One day i watched a video about implementing a GUI library in Rust using Data-Oriented Programming, so instead of creating a wigdet structure with references to its children and parents, there was another structure that behaves as a ‘manager’ which stored each widget on a Vector and the child of each widget as indices on another Vector. At the time i did not understanded correctly, but now it starting to make sense. Thanks for your answer.

Henrique_Becker · June 25, 2022, 5:08pm

Not my best work, but supports sparsity, circularity (given one or more starting elements), and iteration. May implement removal and insertion after.

module FastLinkedLists

struct SingleLL{T, N}
	data :: Vector{T}
	next :: Vector{N}
end

"""
    function SingleLL(data, ::Type{N} = Int; kwargs...) where {T, N}

Create a linked list with unidirectional links (forward). NOTE: starting
with an empty circular buffer is no supported.
"""
function SingleLL(
	data :: Vector{T}, ::Type{N} = Int;
	copy_data = true, sparsity_factor :: Int = 1,
	circular = false
) where {T, N}
	if sparsity_factor < 1
		throw(ArgumentError("sparsity_factor should be one or more"))
	end
	if !copy_data && sparsity_factor > 1
		throw(ArgumentError(
			"You cannot disable copy_data and have sparsity_factor greater than one" *
			", use the default constructor for this kind of fiddling."
		))
	end
	if circular && isempty(data)
		throw(ArgumentError(
			"You cannot start with an empty circular buffer, the circularity is " *
			"engraved into the elements, not the list itself."
		))
	end
	sparsity_factor = convert(N, sparsity_factor)

	isempty(data) && return SingleLL{T, N}(copy_data ? copy(data) : data, N[])

	if isone(sparsity_factor)
		data = copy_data ? copy(data) : data
		next = collect(convert(N, 2):convert(N, length(data)))
		next[end] = circular ? one(N) : zero(N)
		return SingleLL{T, N}(data, next)
	end

	n = length(data)
	m = n * sparsity_factor
	data_ = Vector{T}(undef, m)
	next = zeros(N, m)
	j :: Int, i :: Int = 1, 1
	while i <= n
		data_[j] = data[i]
		next[j] = j + sparsity_factor
		j = next[j]
		i += 1
	end
	next[j - sparsity_factor] = circular ? one(n) : zero(n)

	return SingleLL{T, N}(data_, next)
end

import Base.iterate
function Base.iterate(l :: SingleLL{T, N}) where {T, N}
	isempty(l.next) && return nothing
	i = 1
	while iszero(l.next[i]); i += 1; end
	return (l.data[i], l.next[i])
end

function Base.iterate(l :: SingleLL{T, N}, i) where {T, N}
	iszero(i) && return nothing
	return (l.data[i], l.next[i])
end

end # module

dylanxyz · June 25, 2022, 5:46pm

Thanks for your answer, this works for me too!

pitsianis · June 25, 2022, 10:39pm

Is the output of collect correct? I was expecting to see a list of (String, Integer) pairs.

dylanxyz · June 25, 2022, 11:48pm

Well, the iteration returns the current Node, but if you want to return only the name and value, you do something like:

julia> Base.iterate(node::Node, state::Node = node) = state.name => state.value, state.child

julia> collect(node)
3-element Vector{Any}:
 "x" => 1
 "y" => 2
 "z" => 3

For my specific use case, i realized that having an Array where every next Node is a child of the previous Node is much better than having a linked list, because my situation is much more simpler, since every Node can only have one and only one child or no child.

Topic		Replies	Views
How do I code a simple linked list using DataStructures.jl New to Julia data_structures	7	2666	October 23, 2022
Iterating over a Tree recursively with Base.iterate General Usage question , iterators , tree	8	2064	June 9, 2021
Using iterator in Julia 0.7 General Usage	2	644	June 8, 2018
How to do custom allocators Performance memory-allocation , struct	22	1529	October 17, 2021
Error iterating MutableLinkedList New to Julia	1	322	October 3, 2019

Efficient recursive iteration over self referencing struct

Related topics