Suppose i have a Node type that may have a reference to itself on one of its fields. What is the most efficient way to implement Base.iterate for this Node type, such that at every iteration we get the next child Node?
julia> const Maybe{T} = Union{T, Nothing}
Union{Nothing, T} where T
julia> struct Node{T}
name::String
value::T
child::Maybe{Node}
end
julia> Base.iterate(::Node) = ??
I was trying to overengineer the solution for this problem, and it turns out it is more simpler than i thought, but iām still no sure if thereās a more efficient way, however, this works for now:
julia> function Base.length(node::Node)
len = 0
while !isnothing(node)
len += 1
node = node.child
end
return len
end
julia> Base.iterate(node::Node, state::Node = node) = state, state.child
julia> Base.iterate(node::Node, ::Nothing) = nothing
julia> node = Node("x", 1, Node("y", 2, Node("z", 3)))
Node{Int64}("x", 1, Node{Int64}("y", 2, Node{Int64}("z", 3, nothing)))
julia> length(node)
3
julia> collect(node)
3-element Vector{Any}:
Node{Int64}("x", 1, Node{Int64}("y", 2, Node{Int64}("z", 3, nothing)))
Node{Int64}("y", 2, Node{Int64}("z", 3, nothing))
Node{Int64}("z", 3, nothing)
When you say āmore efficientā, what exactly are you imagining the inefficiency to be? There are some ways you can make a linked list more efficient cache-wise, but they require quite a few changes. Just accessing the struct member and checking whether itās nothing is probably as good as you can get with just this approach.
By āmore efficientā, i mean fast, a non-naive solution. Iām not good at writing efficient (fast) code. For most of the time in the past i didnāt care much about effiency and optimization, as long as my code was working it was fine, but now i want to be better at write efficient code. So if anybody has a better solution for the problem that i described explaining why is better it would be of much help.
(Sorry for my english: i actually speak portuguese)
I see! Well, the inefficiency of linked lists is inherent to the datastructure. What youāre asking about is how you can make a linked list faster in and of itself, and the āproperā answer to that would be ādonāt use a linked listā (assuming you want to make reading faster). Another complication is that linked lists truly are better for some workloads, but this depends on how you want to use them. There is no One-size-fits-all perfect datastructure.
I do not remember seeing a fast linked list implementation, I should write a package with one, but I believe I do not have the time for now.
The fastest way I know to implement a linked list is with two Vector, it works very well if the list starts populated and mostly suffer removals, and/or the insertions are few and distributed among the whole list. However, if you need to insert a lot of elements between other elements already present you either do your way or maybe go for OrderedCollections.jl.
The two-Vector way consists in a dataVector{T} which will keep the elements (no initialization is necessary) and a nextVector{Int} which points to the next position from the current position. If you want to start with everything populated you just create a data Vector with the elements in order, and 1:length(data) as the next. It can be circular (the last next can be one) or not (use 0 or other sentinel value in the last next position). If you think you will add elements, you can make it double the size, initialize only the odd data positions, and have next be [3, 0, 5, 0, 7, ...]. So if you can put new elements between each pair of current elements and future shifts will the the smallest possible. Is this understandable? This is just a very bare-bones explanation.
I think i understand what you mean. One day i watched a video about implementing a GUI library in Rust using Data-Oriented Programming, so instead of creating a wigdet structure with references to its children and parents, there was another structure that behaves as a āmanagerā which stored each widget on a Vector and the child of each widget as indices on another Vector. At the time i did not understanded correctly, but now it starting to make sense. Thanks for your answer.
Not my best work, but supports sparsity, circularity (given one or more starting elements), and iteration. May implement removal and insertion after.
module FastLinkedLists
struct SingleLL{T, N}
data :: Vector{T}
next :: Vector{N}
end
"""
function SingleLL(data, ::Type{N} = Int; kwargs...) where {T, N}
Create a linked list with unidirectional links (forward). NOTE: starting
with an empty circular buffer is no supported.
"""
function SingleLL(
data :: Vector{T}, ::Type{N} = Int;
copy_data = true, sparsity_factor :: Int = 1,
circular = false
) where {T, N}
if sparsity_factor < 1
throw(ArgumentError("sparsity_factor should be one or more"))
end
if !copy_data && sparsity_factor > 1
throw(ArgumentError(
"You cannot disable copy_data and have sparsity_factor greater than one" *
", use the default constructor for this kind of fiddling."
))
end
if circular && isempty(data)
throw(ArgumentError(
"You cannot start with an empty circular buffer, the circularity is " *
"engraved into the elements, not the list itself."
))
end
sparsity_factor = convert(N, sparsity_factor)
isempty(data) && return SingleLL{T, N}(copy_data ? copy(data) : data, N[])
if isone(sparsity_factor)
data = copy_data ? copy(data) : data
next = collect(convert(N, 2):convert(N, length(data)))
next[end] = circular ? one(N) : zero(N)
return SingleLL{T, N}(data, next)
end
n = length(data)
m = n * sparsity_factor
data_ = Vector{T}(undef, m)
next = zeros(N, m)
j :: Int, i :: Int = 1, 1
while i <= n
data_[j] = data[i]
next[j] = j + sparsity_factor
j = next[j]
i += 1
end
next[j - sparsity_factor] = circular ? one(n) : zero(n)
return SingleLL{T, N}(data_, next)
end
import Base.iterate
function Base.iterate(l :: SingleLL{T, N}) where {T, N}
isempty(l.next) && return nothing
i = 1
while iszero(l.next[i]); i += 1; end
return (l.data[i], l.next[i])
end
function Base.iterate(l :: SingleLL{T, N}, i) where {T, N}
iszero(i) && return nothing
return (l.data[i], l.next[i])
end
end # module
For my specific use case, i realized that having an Array where every nextNode is a child of the previousNode is much better than having a linked list, because my situation is much more simpler, since every Node can only have one and only one child or no child.