Mutable sorted collections

I don’t understand what the use case for SortedSet is. It seems to only collect and sort keys, with no values associated with those keys.

I have a homogeneous collection of structs (points of a parameterized curve in a three dimensional space), each one of which has a field (the parameter) whose value could serve as a sort key which would allow me to visit the structs in parameter order – a core behavior of my application.

Though I can define an orderiing based on this key value:

struct StrandPointOrdering <: Base.Order.Ordering
end

Base.Order.lt(o::StrandPointOrdering, a, b) = a.p < b.p

I have no way to find an element by that key value, other than by iterating through the entire set.

If I were to use SortedDict instead, the key value would be redundantly stored, and in each reference to the dict, I would need to add extra code to extract my struct from each dict entry Pair.

I looked at NearestNeighbors.jl but it does not allow modification of the data set. There is some appeal to being able to query my data by a region of the [parameter, x, y, z] space, I don’t expect the typical use of my application to have more than hundreds of such structs.

Am I misunderstanding SortedSet and SortedDict? Is there some other sorted mutable collection type I should look at?

Thanks.

What’s so bad about using a dictionary? You said the keys are redundantly stored, but you also said you only have hundreds of them, so I doubt that’s a computational concern.

2 Likes

See my second reason for not using a dictionary: the extra code that needs to be added at every place I extract an element. It was the process of doing that to my codebase that prompted me to create this topic.

I’ll counter with: what’s so bad about SortedSet takiing a key function to be applied to each of its elements for sorting?

Can you say what exactly you’re trying to achieve? If you just want the values from a sorted dict you can do values(dict). And you can specify an ordering to a sorted set with SortedSet(order).

If I understand correctly, you have a struct S with two fields p and q. You want to store these ordered by p, and you want to retrieve an item in the sorted set knowing only p (i.e., look up by p). First, you need to define an ordering, which I see you have already done in your original post. Note that distinct objects must compare unequal, so that if two objects share the same p but have different q's, then you must extend your order to compare q as well, i.e., lt(a,b) = a.p < b.p || (a.p == b.p && a.q < b.q)

To retrieve an object in the collection given just p, use the searchsortedfirst function.

Thanks.

My intent is to not insert! an object whose key is already present.

The documentation for insert! of SortedSet says

Argument sc is a SortedSet and k is a key. This inserts the key into the container.

(though, as much of the Julia documentation Google finds for me, perhaps thus isn’t current). If k is just the key, how does the SortedSet know the struct? If k is the struct itself, how does it know how to get the key? Of do I need to use SortedDict with the frustrations that entails?

The documentation for insert! of `SortedDict’ isn’t selfconsistent:

insert!(sc, k)

Argument sc is a SortedDict or SortedMultiDict, k is a key and v is the corresponding value.

I assume this in an error.

I find that my h if the Julia documentation is deficient in one way or another and that the maintainers are too busy writing code to fix the docs. How might I help with this?

Maybe not much new info here, but I have the feeling that you are a bit lost in the package ecosystem, so I hope this helps:

  • I guess you are using DataStructures.jl, which is an external package with its own documentation. If you are not sure the docs you see is the “official” one (although there is almost always a single one), JuliaHub has a package browser: JuliaHub .

  • Documentation pages (including for base Julia at docs.julialang.org ) typically have a version selector. “dev” is the latest, it may contain changes not released yet. You can check the installed version in the pkg REPL mode with the status command. (press ‘[’ in the REPL before)

(This is not the base Julia documentation, but works the same:) Fixing small documentation issues is usually as easy as clicking “Edit on GitHub” at the page where you see the problem, and going through the process of creating a pull request in the browser.

1 Like

First, thanks for finding the documentation error. Please either correct it or else open an issue in DataStructures so that the maintainers will correct it.

If I understand correctly, you would like to use part of a struct as a key and then modify another part of the struct after a lookup operation to the SortedSet. Commonly, one would use a Dictionary for this purpose, but it is possible also with SortedSet. The datatype must be mutable. However, you should not mutate any part of the struct taken from the SortedSet collection that is used for the sort-order comparison operation because doing so will corrupt the index of the SortedSet. For example, in the snippet below, you would not be allowed to mutate s1.a. Below is an example of how to modify a struct after a lookup in a SortedSet.

julia> using DataStructures
julia> mutable struct S
       a::Int
       b::Int
       end

julia> Base.isless(s1::S, s2::S) = s1.a < s2.a

julia> w = SortedSet{S}()
SortedSet(S[],
Base.Order.ForwardOrdering())

julia> push!(w, S(7,5))
SortedSet{S,Base.Order.ForwardOrdering} with 1 element:
  S(7, 5)

julia> push!(w,S(2,0))
SortedSet{S,Base.Order.ForwardOrdering} with 2 elements:
  S(2, 0)
  S(7, 5)

julia> st = searchsortedfirst(w,S(2,0))
DataStructures.Tokens.IntSemiToken(4)

julia> s1 = deref((w,st))
S(2, 0)

julia> s1.b=7
7

julia> w
SortedSet{S,Base.Order.ForwardOrdering} with 2 elements:
  S(2, 7)
  S(7, 5)