Base.Iterators.partition for Dict and Set

Stephen_Vavasis · July 9, 2023, 11:53pm

A recent blog post highlighted the use of Base.Iterators.partition for correctly managing state in multithreaded code. I checked how this function behaves on Dict and Set and found that it copies the data: each chunk is materialized as a new Vector (see trace below) rather than as a lazy structure that iterates directly over the entries of the dictionary or set. I suppose that a lazy structure would perform better for most applications. In the future, I may also implement efficient Base.Iterators.partition methods for SortedDict, SortedSet, and SortedMultiDict in DataStructures.jl. Therefore, I am wondering:

1. Why does Base copy the data for this operation on Dict and Set?
2. Would it be a breaking change to reimplement Base.Iterators.partition lazily for Dict and Set instead of copying? In most uses the change would be invisible, but in some odd cases, such as mutating the data structure while iterating over it, it could break a user's code.

julia> s = Set(1:9);

julia> u = Base.Iterators.partition(s,4);

julia> for i in u
       println(i, " ", typeof(i))
       end
[5, 4, 6, 7] Vector{Int64}
[2, 9, 8, 3] Vector{Int64}
[1] Vector{Int64}
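
To make the difference concrete, here is a minimal sketch of what a lazy variant could look like. The helper `lazy_partition` is hypothetical (it is not part of Base or DataStructures.jl) and is built only from `Iterators.take` and `Iterators.drop`. It avoids copying, but each chunk re-walks the container from the start to skip its predecessors, so it is not the efficient implementation I have in mind; it only illustrates the copy-versus-view distinction.

```julia
# Hypothetical helper, not part of Base: each chunk is a lazy wrapper
# around `itr` instead of a freshly allocated Vector.
# Assumes `itr` has a known length and is not mutated during iteration.
lazy_partition(itr, n) =
    (Iterators.take(Iterators.drop(itr, k), n) for k in 0:n:length(itr)-1)

s = Set(1:9)

# Base's chunks are new Vectors, so modifying one leaves `s` untouched.
chunk = first(Iterators.partition(s, 4))
push!(chunk, 100)
@assert !(100 in s)

# The lazy chunks are thin wrappers; nothing is copied until collected.
lazy = lazy_partition(s, 4)
@assert !(first(lazy) isa Vector)

# Both approaches visit the same elements in the same order.
@assert [collect(c) for c in lazy] == collect(Iterators.partition(s, 4))
```

Because the lazy chunks refer back to `s`, mutating `s` while they are in use would change or invalidate what they yield, which is exactly the kind of behavior change raised in the second question.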
